
Single semiconductor graphics platform, Number: 7,064,763, from the United States Patent and Trademark Office (PTO)


Title: Single semiconductor graphics platform

Abstract: A graphics pipeline system and method are provided for graphics processing. Such system includes a transform module positioned on a single semiconductor platform for transforming graphics data from object space to screen space. Coupled to the transform module is a lighting module which is positioned on the single semiconductor platform for lighting the graphics data. Also included is a rasterizer coupled to the lighting module and positioned on the single semiconductor platform for rendering the graphics data.

Patent Number: 7,064,763. Issued on 06/20/2006 to Lindholm, et al.


Inventors: Lindholm; John Erik (Cupertino, CA); Moy; Simon (Mountain View, CA); Dawallu; Kevin (Sunnyvale, CA); Yang; Mingjian (Sunnyvale, CA); Montrym; John (Los Altos, CA); Kirk; David B. (San Francisco, CA); Sabella; Paolo E. (Pleasanton, CA); Papakipos; Matthew N. (Palo Alto, CA); Voorhies; Douglas A. (Menlo Park, CA); Foskett; Nicholas J. (Mountain View, CA)
Assignee: NVIDIA Corporation (Santa Clara, CA)
Appl. No.: 187226
Filed: June 28, 2002


Related U.S. Patent Documents

Application Number    Filing Date    Patent Number    Issue Date
09960004              Sep., 2001
09730652              Dec., 2000     6342888
09454516              Dec., 1999     6198488

Current U.S. Class: 345/506 ; 345/426
Current International Class: G06T 17/00 (20060101)
Field of Search: 345/418,419,506,420,421,422,423,424,426,427,582


References Cited [Referenced By]

U.S. Patent Documents
5694143 December 1997 Fielder et al.
5838337 November 1998 Kimura et al.
5886701 March 1999 Chauvin et al.
5956042 September 1999 Tucker et al.
5963210 October 1999 Lewis et al.
5977997 November 1999 Vainsencher
6000027 December 1999 Pawate et al.
6014144 January 2000 Nelson et al.
6097395 August 2000 Harris et al.
6104417 August 2000 Nielsen et al.
6137497 October 2000 Strunk et al.
6163319 December 2000 Peercy et al.
6198488 March 2001 Lindholm et al.
6326964 December 2001 Snyder et al.
6342888 January 2002 Lindholm et al.
6452595 September 2002 Montrym et al.
6462737 October 2002 Lindholm et al.
Foreign Patent Documents
0690430 Jan., 1996 EP
0690430 Jul., 1996 EP
WO 98/28695 Jul., 1998 WO
WO 99/52040 Oct., 1999 WO
Primary Examiner: Bella; Matthew C.
Assistant Examiner: Vo; Cliff N.
Attorney, Agent or Firm: Cooley Godward LLP

Parent Case Text



RELATED APPLICATIONS

The present application is a continuation of an application filed Sep. 20, 2001 under Ser. No. 09/960,004; which, in turn, is a continuation of an application filed Dec. 5, 2000 under Ser. No. 09/730,652 and issued under U.S. Pat. No. 6,342,888; which, in turn, is a continuation of an application filed on Dec. 6, 1999 under Ser. No. 09/454,516 and issued under U.S. Pat. No. 6,198,488. The present application is related to applications filed Sep. 20, 2001 under Ser. Nos. 09/961,228, 09/961,219, and 09/957,746. The present application is further related to applications entitled "Method, Apparatus and Article of Manufacture for Area Rasterization using Sense Points" which was filed on Dec. 6, 1999 under Ser. No. 09/455,305, "Method, Apparatus and Article of Manufacture for Boustrophedonic Rasterization" which was filed on Dec. 6, 1999 under Ser. No. 09/454,505, "Method, Apparatus and Article of Manufacture for Clip-less Rasterization using Line Equation-based Traversal" which was filed on Dec. 6, 1999 under Ser. No. 09/455,728, "Method, Apparatus and Article of Manufacture for a Vertex Attribute Buffer in a Graphics Processor" which was filed on Dec. 6, 1999 under Ser. No. 09/454,525, "Method, Apparatus and Article of Manufacture for a Transform Module in a Graphics Processor" which was filed on Dec. 6, 1999 under Ser. No. 09/456,102, "Method and Apparatus for a Lighting Module in a Graphics Processor" which was filed on Dec. 6, 1999 under Ser. No. 09/454,524, and "Method, Apparatus and Article of Manufacture for a Sequencer in a Transform/Lighting Module Capable of Processing Multiple Independent Execution Threads" which was filed on Dec. 6, 1999 under Ser. No. 09/456,104, which were filed concurrently herewith, and which are all incorporated herein by reference in their entirety.
Claims



What is claimed is:

1. A graphics pipeline system for graphics processing, comprising: a transform module being positioned on a single semiconductor platform for transforming graphics data from object space to screen space; and a lighting module coupled to the transform module and positioned on the same single semiconductor platform as the transform module for lighting the graphics data; wherein the transform module is adapted for executing multiple operations in parallel through a plurality of logic units thereof.

2. The system as recited in claim 1, wherein the logic units include a multiplication logic unit and an arithmetic logic unit.

3. The system as set forth in claim 1, wherein the single semiconductor platform operates with a Direct3D application program interface.

4. The system as set forth in claim 1, wherein the single semiconductor platform operates with an OpenGL application program interface.

5. The system as set forth in claim 1, wherein the transforming is performed utilizing an add operation and a multiply operation.

6. The system as set forth in claim 1, wherein the lighting is performed utilizing an add operation and a multiply operation.

7. The system as set forth in claim 1, wherein at least one mode bit is utilized to control the transforming at least in part.

8. The system as set forth in claim 1, wherein at least one mode bit is utilized to control the lighting at least in part.

9. The system as set forth in claim 1, wherein a fog operation is performed on the graphics data utilizing the single semiconductor platform.

10. The system as set forth in claim 1, wherein the single semiconductor platform includes a chip.

11. A method for graphics processing, comprising: transforming graphics data from a first space to a second space; lighting the graphics data; and wherein multiple operations are executed in parallel through a plurality of logic units while transforming the graphics data, and the graphics data is transformed and lighted on a single semiconductor platform.

12. The method as recited in claim 11, wherein the logic units include a multiplication logic unit and an arithmetic logic unit.

13. The method as set forth in claim 11, wherein the rendering includes 3-D rendering.

14. The method as set forth in claim 11, wherein the single semiconductor platform operates with a Direct3D application program interface.

15. The method as set forth in claim 11, wherein the single semiconductor platform operates with an OpenGL application program interface.

16. The method as set forth in claim 11, wherein the transforming is performed utilizing an add operation and a multiply operation.

17. The method as set forth in claim 11, wherein the lighting is performed utilizing an add operation and a multiply operation.

18. The method as set forth in claim 11, wherein at least one mode bit is utilized to control the transforming at least in part.

19. The method as set forth in claim 11, wherein at least one mode bit is utilized to control the lighting at least in part.

20. The method as set forth in claim 11, wherein a fog operation is performed on the graphics data utilizing the single semiconductor platform.

21. The method as set forth in claim 11, wherein blending is performed on the graphics data utilizing the single semiconductor platform.

22. The method as set forth in claim 11, wherein the single semiconductor platform includes a chip.

23. A graphics pipeline system for graphics processing, comprising: a transform module being positioned on a single semiconductor platform for transforming graphics data from object space to screen space; and a lighting module coupled to the transform module and positioned on the same single semiconductor platform as the transform module for lighting the graphics data; wherein the lighting module is adapted for executing multiple operations in parallel through a plurality of logic units thereof.

24. The system as recited in claim 23, wherein the logic units include a multiplication logic unit and an arithmetic logic unit.

25. The system as recited in claim 23, wherein the transform module and the lighting module include a multiplication logic unit and an arithmetic logic unit.

26. The system as set forth in claim 23, wherein the single semiconductor platform operates with a Direct3D application program interface.

27. The system as set forth in claim 23, wherein the single semiconductor platform operates with an OpenGL application program interface.

28. The system as set forth in claim 23, wherein the transforming is performed utilizing an add operation and a multiply operation.

29. The system as set forth in claim 23, wherein the lighting is performed utilizing an add operation and a multiply operation.

30. The system as set forth in claim 23, wherein at least one mode bit is utilized to control the transforming at least in part.

31. The system as set forth in claim 23, wherein at least one mode bit is utilized to control the lighting at least in part.

32. The system as set forth in claim 23, wherein a fog operation is performed on the graphics data utilizing the single semiconductor platform.

33. The system as set forth in claim 23, wherein the single semiconductor platform includes a chip.

34. A method for graphics processing, comprising: transforming graphics data from a first space to a second space; and lighting the graphics data; wherein multiple operations are executed in parallel through a plurality of logic units while lighting the graphics data, and the graphics data is transformed and lighted on a single semiconductor platform.

35. The method as recited in claim 34, wherein the logic units include a multiplication logic unit and an arithmetic logic unit.

36. The method as set forth in claim 34, and further comprising 3-D rendering on the single semiconductor platform.

37. The method as set forth in claim 34, wherein the single semiconductor platform operates with a Direct3D application program interface.

38. The method as set forth in claim 34, wherein the single semiconductor platform operates with an OpenGL application program interface.

39. The method as set forth in claim 34, wherein the transforming is performed utilizing an add operation and a multiply operation.

40. The method as set forth in claim 34, wherein the lighting is performed utilizing an add operation and a multiply operation.

41. The method as set forth in claim 34, wherein at least one mode bit is utilized to control the transforming at least in part.

42. The method as set forth in claim 34, wherein at least one mode bit is utilized to control the lighting at least in part.

43. The method as set forth in claim 34, wherein a fog operation is performed on the graphics data utilizing the single semiconductor platform.

44. The method as set forth in claim 34, wherein blending is performed on the graphics data utilizing the single semiconductor platform.

45. The method as set forth in claim 34, wherein the single semiconductor platform includes a chip.

46. A single semiconductor platform system, comprising: a transform module positioned on a single semiconductor platform for transforming graphics data; a lighting module positioned on the same single semiconductor platform as the transform module for lighting the graphics data; a set-up module positioned on the same single semiconductor platform as the transform module and the lighting module for setting up the graphics data; and a render module positioned on the same single semiconductor platform as the transform module, the lighting module, and the set-up module for rendering the graphics data; wherein multiple operations are executed in parallel through a plurality of logic units while transforming and lighting the graphics data.

47. A method for graphics processing utilizing a single semiconductor platform, comprising: transforming graphics data; lighting the graphics data; setting up the graphics data; and rendering the graphics data; wherein the graphics data is transformed, lighted, set up, and rendered on the single semiconductor platform; wherein multiple operations are executed through a plurality of logic units while transforming and lighting the graphics data.

48. A method for graphics processing utilizing a single semiconductor platform adapted for being coupled to a central processing unit, comprising: transforming graphics data utilizing a plurality of logic units including an addition logic unit and a multiplication logic unit; wherein at least one mode bit is utilized to control the transforming at least in part; and lighting the graphics data utilizing a plurality of logic units including an addition logic unit and a multiplication logic unit; wherein at least one mode bit is utilized to control the lighting at least in part; wherein the graphics data is transformed and lighted utilizing the single semiconductor platform; wherein the single semiconductor platform operates with an OpenGL application program interface.

49. A system, comprising: a single semiconductor platform for transforming graphics data and lighting the graphics data; wherein multiple operations are executed in parallel through a plurality of logic units while transforming the graphics data.

50. A system, comprising: a single semiconductor platform for transforming graphics data and lighting the graphics data; wherein multiple operations are executed in parallel through a plurality of logic units while lighting the graphics data.

51. A system, comprising: a single semiconductor platform for transforming graphics data and lighting the graphics data, the single semiconductor platform adapted to operate with a graphics application program interface, and in conjunction with a central processing unit; wherein multiple operations are executed in parallel through a plurality of logic units while transforming the graphics data.

52. A system, comprising: a single semiconductor platform for transforming graphics data and lighting the graphics data, the single semiconductor platform adapted to operate with a graphics application program interface, and in conjunction with a central processing unit; wherein multiple operations are executed in parallel through a plurality of logic units while lighting the graphics data.
Description



FIELD OF THE INVENTION

The present invention relates generally to graphics processors and, more particularly, to graphics pipeline systems including transform, lighting and rasterization modules.

BACKGROUND OF THE INVENTION

Three-dimensional graphics are central to many applications. For example, computer aided design (CAD) has spurred growth in many industries where computer terminals, cursors, CRTs and graphics terminals are replacing pencil and paper, and computer disks and tapes are replacing drawing vaults. Most, if not all, of these industries have a great need to manipulate and display three-dimensional objects. This has led to widespread interest and research into methods of modeling, rendering, and displaying three-dimensional objects on a computer screen or other display device. The number of computations needed to realistically render and display a three-dimensional graphical object, however, remains quite large, and truly realistic display of three-dimensional objects has largely been limited to high-end systems. There is, however, an ever-increasing need for inexpensive systems that can quickly and realistically render and display three-dimensional objects.

One industry that has seen a tremendous amount of growth in the last few years is the computer game industry. The current generation of computer games is moving to three-dimensional graphics in an ever increasing fashion. At the same time, the speed of play is being driven faster and faster. This combination has fueled a genuine need for the rapid rendering of three-dimensional graphics in relatively inexpensive systems. In addition to gaming, this need is also fueled by e-Commerce applications, which demand increased multimedia capabilities.

Rendering and displaying three-dimensional graphics typically involves many calculations and computations. For example, to render a three-dimensional object, a set of coordinate points or vertices that define the object to be rendered must be formed. Vertices can be joined to form polygons that define the surface of the object to be rendered and displayed. Once the vertices that define an object are formed, the vertices must be transformed from an object or model frame of reference to a world frame of reference and finally to two-dimensional coordinates that can be displayed on a flat display device. Along the way, vertices may be rotated, scaled, eliminated or clipped because they fall outside the viewable area, lit by various lighting schemes, colorized, and so forth. Thus the process of rendering and displaying a three-dimensional object can be computationally intensive and may involve a large number of vertices.
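The chain of frames described above (model, then world, then two-dimensional display coordinates) can be illustrated with a minimal Python sketch. It is not taken from the patent: the helper names (mat_vec, translation, perspective, to_screen), the camera convention (viewer at the origin looking down the negative z axis), and the viewport mapping are all assumptions chosen only for illustration. Note that each matrix-vector product reduces to multiply and add operations.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def translation(tx, ty, tz):
    """Hypothetical model-to-world matrix: a pure translation."""
    return [[1, 0, 0, tx],
            [0, 1, 0, ty],
            [0, 0, 1, tz],
            [0, 0, 0, 1]]

def perspective(near):
    """Assumed projection: viewer at origin looking down -z; w receives -z."""
    return [[near, 0, 0, 0],
            [0, near, 0, 0],
            [0, 0, 1, 0],
            [0, 0, -1, 0]]

def to_screen(vertex, model_to_world, projection, width, height):
    """Carry one vertex from model space to 2D pixel coordinates."""
    x, y, z = vertex
    world = mat_vec(model_to_world, [x, y, z, 1.0])  # model -> world
    clip = mat_vec(projection, world)                # world -> clip space
    w = clip[3]
    if w <= 0:
        return None  # behind the viewer; a real pipeline would clip instead
    ndc_x, ndc_y = clip[0] / w, clip[1] / w          # perspective divide
    # Map the [-1, 1] range onto the window, with y growing downward.
    return ((ndc_x + 1.0) * 0.5 * width, (1.0 - ndc_y) * 0.5 * height)
```

For example, a vertex at the model origin placed two units in front of the viewer lands at the center of a 640 by 480 window under these assumptions.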

A general system that implements such a pipeline is illustrated in Prior Art FIG. 1. In this system, data source 10 generates a stream of expanded vertices defining primitives. These vertices are passed, one at a time, through pipelined graphics system 12 via vertex memory 13 for storage purposes. Once the expanded vertices are received from the vertex memory 13 into the pipelined graphics system 12, the vertices are transformed and lit by a transform module 14 and a lighting module 16, respectively, and further clipped and set up for rendering by a rasterizer 18, thus generating rendered primitives that are displayed on display device 20.

During operation, the transform module 14 may be used to perform scaling, rotation, and projection of a set of three-dimensional vertices from their local or model coordinates to the two-dimensional window that will be used to display the rendered object. The lighting module 16 sets the color and appearance of a vertex based on various lighting schemes, light locations, ambient light levels, materials, and so forth. The rasterization module 18 rasterizes or renders vertices that have previously been transformed and/or lit. The rasterization module 18 renders the object to a rendering target, which can be a display device or an intermediate hardware or software structure that in turn moves the rendered data to a display device.
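As a rough illustration of how a lighting stage might set a vertex color from a light location, an ambient level, and a material, the following Python sketch applies a simple ambient-plus-diffuse (Lambertian) model. The patent text here does not specify a lighting equation, so this model, the function name light_vertex, and its parameters are assumptions made purely for illustration.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length (assumes v is not the zero vector)."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    """Dot product: one multiply per component, then adds."""
    return sum(x * y for x, y in zip(a, b))

def light_vertex(position, normal, light_pos, light_color,
                 material_color, ambient):
    """Return an RGB color for one vertex under a single point light.

    Assumed model: color = material * (ambient + max(0, N.L) * light),
    clamped to 1.0 per channel. The light must not sit exactly on the vertex.
    """
    n = normalize(normal)
    to_light = normalize(tuple(l - p for l, p in zip(light_pos, position)))
    diffuse = max(0.0, dot(n, to_light))  # surfaces facing away get no diffuse
    return tuple(min(1.0, m * (ambient + diffuse * lc))
                 for m, lc in zip(material_color, light_color))
```

A vertex whose normal points straight at the light receives the full diffuse term, while a vertex facing away from the light keeps only the ambient contribution.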

When manufacturing graphics processing systems, there is a general need to increase the speed of the various graphics processing components, while minimizing costs. In general, integration is often employed to increase the speed of a system. Integration refers to the incorporation of different processing modules on a single integrated circuit. With such processing modules communicating in a microscopic semiconductor environment, as opposed to external buses, speed is vastly increased.

Integration is often limited, however, by the cost of implementing and manufacturing multiple processing modules on a single chip. In the realm of graphics processing, any attempt to integrate the transform, lighting, and rasterization modules for increased speed would be cost prohibitive. The reason for this increase in cost is that the required integrated circuit would be of a size that is simply too expensive to be feasible.

This size increase is due mainly to the complexity of the various engines. High performance transform and lighting engines alone are very intricate and are thus expensive to implement on-chip, let alone implement with any additional functionality. Further, conventional rasterizers are multifaceted, with the tasks of clipping, rendering, etc., making any cost-effective attempt to combine such a module with the transform and lighting modules nearly impossible.

There is therefore a need for a transform, lighting, and rasterization module having a design that allows cost-effective integration.

DISCLOSURE OF THE INVENTION

A graphics pipeline system and method are provided for graphics processing. Such system includes a transform module positioned on a single semiconductor platform for transforming graphics data from object space to screen space. Coupled to the transform module is a lighting module which is positioned on the single semiconductor platform for lighting the graphics data. Also included is a rasterizer coupled to the lighting module and positioned on the single semiconductor platform for rendering the graphics data.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other aspects and advantages are better understood from the following detailed description of a preferred embodiment of the invention with reference to the drawings, in which:

FIG. 1 illustrates a prior art graphics processing system;

FIG. 1A is a flow diagram illustrating the various components of one embodiment of the present invention implemented on a single semiconductor platform;

FIG. 2 is a schematic diagram of a vertex attribute buffer (VAB) in accordance with one embodiment of the present invention;

FIG. 2A is a chart illustrating the various commands that may be received by VAB in accordance with one embodiment of the present invention;

FIG. 2B is a flow chart illustrating a method of loading and draining vertex attributes to and from VAB in accordance with one embodiment of the present invention;

FIG. 2C is a schematic diagram illustrating the architecture of the present invention employed to implement the operations of FIG. 2B;

FIG. 3 illustrates the mode bits associated with VAB in accordance with one embodiment of the present invention;

FIG. 4 illustrates the transform module of the present invention;

FIG. 4A is a flow chart illustrating a method of running multiple execution threads in accordance with one embodiment of the present invention;

FIG. 4B is a flow diagram illustrating a manner in which the method of FIG. 4A is carried out in accordance with one embodiment of the present invention;

FIG. 5 illustrates the functional units of the transform module of FIG. 4 in accordance with one embodiment of the present invention;

FIG. 6 is a schematic diagram of the multiplication logic unit (MLU) of the transform module of FIG. 5;

FIG. 7 is a schematic diagram of the arithmetic logic unit (ALU) of the transform module of FIG. 5;

FIG. 8 is a schematic diagram of the register file of the transform module of FIG. 5;

FIG. 9 is a schematic diagram of the inverse logic unit (ILU) of the transform module of FIG. 5;

FIG. 10 is a chart of the output addresses of the output converter of the transform module of FIG. 5 in accordance with one embodiment of the present invention;

FIG. 11 is an illustration of the micro-code organization of the transform module of FIG. 5 in accordance with one embodiment of the present invention;

FIG. 12 is a schematic diagram of the sequencer of the transform module of FIG. 5 in accordance with one embodiment of the present invention;

FIG. 13 is a flowchart delineating the various operations associated with use of the sequencer of the transform module of FIG. 12;

FIG. 14 is a flow diagram delineating the operation of the sequencing component of the sequencer of the transform module of FIG. 12;

FIG. 14A is a flow diagram illustrating the components of the present invention employed for handling scalar and vector components during graphics-processing;

FIG. 14B is a flow diagram illustrating one possible combination 1451 of the functional components of the present invention shown in FIG. 14A which corresponds to the transform module of FIG. 5;

FIG. 14C is a flow diagram illustrating another possible combination 1453 of the functional components of the present invention shown in FIG. 14A;

FIG. 14D illustrates a method implemented by the transform module of FIG. 12 for performing a blending operation during graphics-processing in accordance with one embodiment of the present invention;

FIG. 15 is a schematic diagram of the lighting module of one embodiment of the present invention;

FIG. 16 is a schematic diagram showing the functional units of the lighting module of FIG. 15 in accordance with one embodiment of the present invention;

FIG. 17 is a schematic diagram of the multiplication logic unit (MLU) of the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 18 is a schematic diagram of the arithmetic logic unit (ALU) of the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 19 is a schematic diagram of the register unit of the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 20 is a schematic diagram of the lighting logic unit (LLU) of the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 21 is an illustration of the flag register associated with the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 22 is an illustration of the micro-code fields associated with the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 23 is a schematic diagram of the sequencer associated with the lighting module of FIG. 16 in accordance with one embodiment of the present invention;

FIG. 24 is a flowchart delineating the manner in which the sequencers of the transform and lighting modules are capable of controlling the input and output of the associated buffers in accordance with one embodiment of the present invention;

FIG. 25 is a diagram illustrating the manner in which the sequencers of the transform and lighting modules are capable of controlling the input and output of the associated buffers in accordance with the method of FIG. 24;

FIG. 25B is a schematic diagram of the various modules of the rasterizer of FIG. 1A;

FIG. 26 illustrates a schematic of the set-up module of the rasterization module of the present invention;

FIG. 26A is an illustration showing the various parameters calculated by the set-up module of the rasterizer of FIG. 26;

FIG. 27 is a flowchart illustrating a method of the present invention associated with the set-up and traversal modules of the rasterizer component shown in FIG. 26;

FIG. 27A illustrates sense points that enclose a convex region that is moved to identify an area in a primitive in accordance with one embodiment of the present invention;

FIG. 28 is a flowchart illustrating a process of the present invention associated with the process row operation 2706 of FIG. 27;

FIG. 28A is an illustration of the sequence in which the convex region of the present invention is moved about the primitive;

FIG. 28B illustrates another example of the sequence in which the convex region of the present invention is moved about the primitive;

FIG. 29 is a flowchart illustrating an alternate boustrophedonic process of the present invention associated with the process row operation 2706 of FIG. 27;

FIG. 29A is an illustration of the sequence in which the convex region of the present invention is moved about the primitive in accordance with the boustrophedonic process of FIG. 29;

FIG. 30 is a flowchart illustrating an alternate boustrophedonic process using boundaries;

FIG. 31 is a flowchart showing the process associated with operation 3006 of FIG. 30;

FIG. 31A is an illustration of the sequence in which the convex region of the present invention is moved about the primitive in accordance with the boundary-based boustrophedonic process of FIGS. 30 and 31;

FIG. 32 is a flowchart showing the process associated with operation 2702 of FIG. 27;

FIG. 32A is an illustration showing which area is drawn if no negative W-values are calculated in the process of FIG. 32;

FIG. 32B is an illustration showing which area is drawn if only one negative W-value is calculated in the process of FIG. 32; and

FIG. 32C is an illustration showing which area is drawn if only two negative W-values are calculated in the process of FIG. 32.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 shows the prior art. FIGS. 1A-32C show a graphics pipeline system of the present invention.

FIG. 1A is a flow diagram illustrating the various components of one embodiment of the present invention. As shown, the present invention is divided into four main modules including a vertex attribute buffer (VAB) 50, a transform module 52, a lighting module 54, and a rasterization module 56 with a set-up module 57. In one embodiment, each of the foregoing modules is situated on a single semiconductor platform in a manner that will be described hereinafter in greater detail. In the present description, the single semiconductor platform may refer to a sole unitary semiconductor-based integrated circuit or chip.

The VAB 50 is included for gathering and maintaining a plurality of vertex attribute states such as position, normal, colors, texture coordinates, etc. Completed vertices are processed by the transform module 52 and then sent to the lighting module 54. The transform module 52 generates vectors for the lighting module 54 to light. The output of the lighting module 54 is screen space data suitable for the set-up module which, in turn, sets up primitives. Thereafter, rasterization module 56 carries out rasterization of the primitives. It should be noted that the transform and lighting modules 52 and 54 might only stall on the command level such that a command is always finished once started.

In one embodiment, the present invention includes a hardware implementation that at least partially employs Open Graphics Library (OpenGL.RTM.) and D3D.TM. transform and lighting pipelines. OpenGL.RTM. is the computer industry's standard application program interface (API) for defining 2-D and 3-D graphic images. With OpenGL.RTM., an application can create the same effects in any operating system using any OpenGL.RTM.-adhering graphics adapter. OpenGL.RTM. specifies a set of commands or immediately executed functions. Each command directs a drawing action or causes special effects.

FIG. 2 is a schematic diagram of VAB 50 in accordance with one embodiment of the present invention. As shown, VAB 50 passes command bits 200 while storing data bits 204 representative of attributes of a vertex and mode bits 202. In use, VAB 50 receives the data bits 204 of vertices and drains the same.

The VAB 50 is adapted for receiving and storing a plurality of possible vertex attribute states via the data bits 204. In use, after such data bits 204, or vertex data, are received and stored in VAB 50, the vertex data is outputted from VAB 50 to a graphics-processing module, namely the transform module 52. Further, the command bits 200 are passed by VAB 50 for determining a manner in which the vertex data is inputted to VAB 50 in addition to other processing which will be described in greater detail with reference to FIG. 2A. Such command bits 200 are received from a command bit source such as a microcontroller, CPU, data source or any other type of source which is capable of generating command bits 200.

Further, mode bits 202 are passed which are indicative of the status of a plurality of modes of process operations. As such, mode bits 202 are adapted for determining a manner in which the vertex data is processed in the subsequent graphics-processing modules. Such mode bits 202 are received from a command bit source such as a microcontroller, CPU, data source or any other type of source which is capable of generating mode bits 202.

It should be noted that the various functions associated with VAB 50 may be governed by way of dedicated hardware, software or any other type of logic. In various embodiments, 64, 128, 256 or any other number of mode bits 202 may be employed.

The VAB 50 also functions as a gathering point for the 64 bit data that needs to be converted into a 128-bit format. The VAB 50 input is 64 bits/cycle and the output is 128 bits/cycle. In other embodiments, VAB 50 may function as a gathering point for 128-bit data, and VAB 50 input may be 128 bits/cycle or any other combination. The VAB 50 further has reserved slots for a plurality of vertex attributes that are all IEEE 32 bit floats. The number of such slots may vary per the desires of the user. Table 1 illustrates exemplary vertex attributes employed by the present invention.

TABLE 1

Position:        x, y, z, w
Diffuse Color:   r, g, b, a
Specular Color:  r, g, b
Fog:             f
Texture0:        s, t, r, q
Texture1:        s, t, r, q
Normal:          nx, ny, nz
Skin Weight:     w

During operation, VAB 50 may operate assuming that the x,y data pair is written before the z,w data pair since this allows for defaulting the z,w pair to (0.0,1.0) at the time of the x,y write. This may be important for default components in OpenGL.RTM. and D3D.TM.. It should be noted that the position, texture0, and texture1 slots default the third and fourth components to (0.0,1.0). Further, the diffuse color slot defaults the fourth component to (1.0) and the texture slots default the second component to (0.0).
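The defaulting rule described above can be sketched as follows. This is a minimal illustration in Python, not the hardware logic: the names `VabSlot`, `write_xy`, and `write_zw` are hypothetical, and only the position-style behaviour is modelled, in which an x,y write resets the z,w pair to (0.0, 1.0).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VabSlot:
    """One four-component slot (hypothetical model of a position or texture slot)."""
    data: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0, 1.0])

    def write_xy(self, x: float, y: float) -> None:
        # Writing the first pair also resets the second pair to its defaults,
        # so a later z,w write is optional.
        self.data = [x, y, 0.0, 1.0]

    def write_zw(self, z: float, w: float) -> None:
        # Assumes write_xy has already been issued for this vertex.
        self.data[2] = z
        self.data[3] = w


slot = VabSlot()
slot.write_xy(2.0, 3.0)      # 2-D position: z,w fall back to (0.0, 1.0)
two_d = list(slot.data)
slot.write_zw(5.0, 1.0)      # full position supplied by a second write
four_d = list(slot.data)
```

Because the defaults are applied at the time of the x,y write, a vertex that supplies only x and y still yields a complete four-component position.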

The VAB 50 includes still another slot 205 used for assembling the data bits 204 that may be passed into or through the transform and lighting modules 52 and 54, respectively, without disturbing the data bits 204. The data bits 204 in the slot 205 can be in a floating point or integer format. As mentioned earlier, the data bits 204 of each vertex have an associated set of mode bits 202 representative of the modes affecting the processing of the data bits 204. These mode bits 202 are passed with the data bits 204 through the transform and lighting modules 52 and 54, respectively, for purposes that will be set forth hereinafter in greater detail.

In one embodiment, there may be 18 valid VAB, transform, and lighting commands received by VAB 50. FIG. 2A is a chart illustrating the various commands that may be received by VAB 50 in accordance with one embodiment of the present invention. It should be understood that all load and read context commands, and the passthrough command shown in the chart of FIG. 2A transfer one data word of up to 128 bits or any other size.

Each command of FIG. 2A may contain control information dictating whether each set of data bits 204 is to be written into a high double word or low double word of one VAB address. In addition, a 2-bit write mask may be employed for providing control to the word level. Further, there may be a launch bit that informs VAB controller that all of the data bits 204 are present for a current command to be executed.

Each command has an associated stall field that allows a look-up to find information on whether the command is a read command in that it reads context memory or is a write command in that it writes context memory. By using the stall field of currently executing commands, the new command may be either held off in case of conflict or allowed to proceed.

In operation, VAB 50 can accept one input data word up to 128 bits (or any other size) per cycle and output one data word up to 128 bits (or any other size) per cycle. For the load commands, this means that it may take two cycles to load the data into VAB 50 to create a 128-bit quad-word and one cycle to drain it. For the scalar memories in the lighting module 54, it is not necessary to accumulate a full quad-word, and these can be loaded in one cycle/address. For one vertex, it can take up to 14 cycles to load the 7 VAB slots while it only takes 7 cycles to drain them. It should be noted, however, that it is only necessary to update the vertex state that changes between executing vertex commands. This means that, in one case, the vertex position may be updated taking 2 cycles, while the draining of the vertex data takes 7 cycles. It should be noted that only 1 cycle may be required in the case of the x,y position.
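The cycle counts quoted above follow from the 64-bit input and 128-bit output widths of this embodiment. The short sketch below reproduces that arithmetic; the function names are hypothetical, and every slot is assumed to hold a full 128-bit quad-word.

```python
INPUT_BITS_PER_CYCLE = 64    # VAB input width in this embodiment
OUTPUT_BITS_PER_CYCLE = 128  # VAB output width in this embodiment
SLOT_BITS = 128              # one quad-word per slot (assumed for every slot)


def load_cycles(slots_updated: int) -> int:
    """Cycles needed to write the given number of full slots into the VAB."""
    return slots_updated * (SLOT_BITS // INPUT_BITS_PER_CYCLE)


def drain_cycles(total_slots: int) -> int:
    """Cycles needed to drain every slot of one vertex out of the VAB."""
    return total_slots * (SLOT_BITS // OUTPUT_BITS_PER_CYCLE)


full_load = load_cycles(7)       # all 7 slots rewritten
position_only = load_cycles(1)   # only the position changes between vertices
full_drain = drain_cycles(7)
```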

FIG. 2B is a flow chart illustrating one method of loading and draining vertex attributes to and from VAB 50 during graphics-processing. Initially, in operation 210, at least one set of vertex attributes is received in VAB 50 for being processed. As mentioned earlier, each set of vertex attributes may be unique, and correspond to a single vertex.

In use the vertex attributes are stored in VAB 50 upon the receipt thereof in operation 212. Further, each set of stored vertex attributes is transferred to a corresponding one of a plurality of input buffers of the transform module 52. The received set of vertex attributes is also monitored in order to determine whether a received vertex attribute has a corresponding vertex attribute of a different set currently stored in VAB 50, as indicated in operation 216.

Upon it being determined that a stored vertex attribute corresponds to the received vertex attribute in decision 217, the stored vertex attribute is outputted to the corresponding input buffer of the transform module 52 out of order. See operation 218. Immediately upon the stored vertex attribute being outputted, the corresponding incoming vertex attribute may take its place in VAB 50. If no correspondence is found, however, each set of the stored vertex attributes may be transferred to the corresponding input buffer of the transform module 52 in accordance with a regular predetermined sequence. Note operation 219.

It should be noted that the stored vertex attribute might not be transferred in the aforementioned manner if it has an associated launch command. Further, in order for the foregoing method to work properly, the bandwidth of an output of VAB 50 must be at least the bandwidth of an input of VAB 50.

FIG. 2C is a schematic diagram illustrating the architecture of the present invention employed to implement the operations of FIG. 2B. As shown, VAB 50 has a write data terminal WD, a read data terminal RD, a write address terminal WA, and a read address terminal RA. The read data terminal is coupled to a first clock-controlled buffer 230 for outputting the data bits 204 from VAB 50.

Also included is a first multiplexer 232 having an output coupled to the read address terminal of VAB 50 and a second clock-controlled buffer 234. A first input of the first multiplexer 232 is coupled to the write address terminal of VAB 50 while a second input of the first multiplexer 232 is coupled to an output of a second multiplexer 236. A logic module 238 is coupled between the first and second multiplexers 232 and 236, the write address terminal of VAB 50, and an output of the second clock-controlled buffer 234.

In use the logic module 238 serves to determine whether an incoming vertex attribute is pending to drain in VAB 50. In one embodiment, this determination may be facilitated by monitoring a bit register that indicates whether a vertex attribute is pending or not. If it is determined that the incoming vertex attribute does have a match currently in VAB 50, the logic module 238 controls the first multiplexer 232 in order to drain the matching vertex attribute so that the incoming vertex attribute may be immediately stored in its place. On the other hand, if it is determined that the incoming vertex attribute does not have a match currently in VAB 50, the logic module 238 controls the first multiplexer 232 such that VAB 50 is drained and the incoming vertex attribute is loaded sequentially or in some other predetermined order, per the input of the second multiplexer 236 which may be updated by the logic module 238.

As a result, there is no requirement for VAB 50 to drain multiple vertex attributes before a new incoming vertex attribute may be loaded. The pending vertex attribute forces out the corresponding VAB counterpart if possible, thus allowing it to proceed. As a result, VAB 50 can drain in an arbitrary order. Without this capability, it would take 7 cycles to drain VAB 50 and possibly 14 more cycles to load it. By overlapping the loading and draining, higher performance is achieved. It should be noted that this is only possible if an input buffer is empty and VAB 50 can drain into input buffers of the transform module 52.
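The overlapped loading and draining described above might be modelled in software as follows. This is a toy sketch only: `VabModel`, its `pending` list (standing in for the bit register monitored by the logic module 238), and the `drained` list (standing in for an input buffer of the transform module 52) are hypothetical names, and every written slot is treated as pending, which is a simplification.

```python
from typing import Dict, List, Tuple


class VabModel:
    """Toy model of overlapped loading and draining (all names hypothetical)."""

    def __init__(self, num_slots: int = 7) -> None:
        self.slots: Dict[int, object] = {}
        self.pending = [False] * num_slots           # one "pending to drain" bit per slot
        self.drained: List[Tuple[int, object]] = []  # stands in for an input buffer

    def _drain(self, addr: int) -> None:
        self.drained.append((addr, self.slots[addr]))
        self.pending[addr] = False

    def write(self, addr: int, value: object) -> None:
        # If the slot still holds an undrained attribute, force it out first
        # (out of order) so the incoming attribute can take its place.
        if self.pending[addr]:
            self._drain(addr)
        self.slots[addr] = value
        self.pending[addr] = True

    def drain_next(self) -> bool:
        # No conflict: drain in the regular sequence (lowest pending address first).
        for addr, is_pending in enumerate(self.pending):
            if is_pending:
                self._drain(addr)
                return True
        return False


vab = VabModel()
vab.write(0, "pos_v0")
vab.write(4, "tex0_v0")
vab.write(4, "tex0_v1")   # conflicts with the undrained tex0_v0, which is forced out
while vab.drain_next():
    pass
order = vab.drained
```

In this run the conflicting slot 4 is drained ahead of slot 0, which illustrates the arbitrary drain order; the remaining slots then drain in sequence.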

FIG. 3 illustrates the mode bits associated with VAB 50 in accordance with one embodiment of the present invention. The transform/light mode information is stored in a register via mode bits 202. Mode bits 202 are used to drive the sequencers of the transform module 52 and lighting module 54 in a manner that will become apparent hereinafter. Each vertex has associated mode bits 202 that may be unique, and can therefore execute a specifically tailored program sequence. While mode bits 202 may generally map directly to the graphics API, some of them may be derived.

In one embodiment, the active light bits (LIS) of FIG. 3 may be contiguous. Further, the pass-through bit (VPAS) is unique in that when it is turned on, the vertex data is passed through with scale and bias, and no transforms or lighting is done. Possible mode bits 202 used when VPAS is true are the texture divide bits (TDV0,1), and foggen bits (used to extract fog value in D3D.TM.). VPAS is thus used for pre-transformed data, and TDV0,1 are used to deal with a cylindrical wrap mode in the context of D3D.TM..

FIG. 4 illustrates the transform module of one embodiment of the present invention. As shown, the transform module 52 is connected to VAB 50 by way of 6 input buffers 400. In one embodiment, each input buffer 400 might be 7*128 bits in size, so that each of the 6 input buffers 400 is capable of storing 7 quad words. Such input buffers 400 follow the same layout as VAB 50, except that the pass data is overlapped with the position data.

In one embodiment, a bit might be designated for each attribute of each input buffer 400 to indicate whether data has changed since the previous instance that the input buffer 400 was loaded. By this design, each input buffer 400 might be loaded only with changed data.
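One way to picture the per-attribute changed bit is the sketch below. The class `InputBufferModel` and its `load` method are hypothetical, and the comparison of incoming and stored values is an assumption made for illustration; the hardware might instead track which attributes were rewritten.

```python
from typing import List, Optional, Sequence


class InputBufferModel:
    """Toy input buffer with one changed bit per attribute (hypothetical names)."""

    def __init__(self, num_attributes: int = 7) -> None:
        self.values: List[Optional[object]] = [None] * num_attributes
        self.changed: List[bool] = [False] * num_attributes

    def load(self, incoming: Sequence[object]) -> int:
        """Load a vertex, copying only attributes that differ; return copies made."""
        copies = 0
        for i, value in enumerate(incoming):
            self.changed[i] = value != self.values[i]
            if self.changed[i]:
                self.values[i] = value
                copies += 1
        return copies


buf = InputBufferModel(num_attributes=3)
first = buf.load(["p0", "c0", "n0"])    # everything is new on the first load
second = buf.load(["p1", "c0", "n0"])   # only the first attribute changed
```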

The transform module 52 is further connected to 6 output vertex buffers 402 in the lighting module 54. The output buffers include a first buffer 404, a second buffer 406, and a third buffer 408. As will become apparent hereinafter, the contents, i.e. position, texture coordinate data, etc., of the third buffer 408 are not used in the lighting module 54. The first buffer 404 and second buffer 406 are both, however, used for inputting lighting and color data to the lighting module 54. Two buffers are employed since the lighting module is adapted to handle two read inputs. It should be noted that the data might be arranged so as to avoid any problems with read conflicts, etc.

Further coupled to the transform module 52 are context memory 410 and micro-code ROM memory 412. The transform module 52 serves to convert object space vertex data into screen space, and to generate any vectors required by the lighting module 54. The transform module 52 also processes skinning and texture coordinates. In one embodiment, the transform module 52 might be a 128-bit design processing 4 floats in parallel, and might be optimized for doing 4 term dot products.

FIG. 4A is a flow chart illustrating a method of executing multiple threads in the transform module 52 in accordance with one embodiment of the present invention. In operation, the transform module 52 is capable of processing 3 vertices in parallel via interleaving. To this end, 3 commands can be simultaneously executed in parallel unless there are stall conditions between the commands such as writing and subsequently reading from the context memory 410. The 3 execution threads are independent of each other and can be any command since all vertices contain unique corresponding mode bits 202.

As shown in FIG. 4A, the method of executing multiple threads includes determining a current thread to be executed in operation 420. This determination might be made by identifying a number of cycles that a graphics-processing module requires for completion of an operation, and tracking the cycles. By tracking the cycles, each thread can be assigned to a cycle, thus allowing determination of the current thread based on the current cycle. It should be noted, however, that such determination might be made in any desired manner that is deemed effective.

Next, in operation 422, an instruction associated with a thread to be executed during a current cycle is retrieved using a corresponding program counter number. Thereafter, the instruction is executed on the graphics-processing module in operation 424.

In one example of use, the instant method includes first accessing a first instruction, or code segment, per a first program counter. As mentioned earlier, such program counter is associated with a first execution thread. Next, the first code segment is executed in the graphics-processing module. As will soon become apparent, such graphics-processing module might take the form of an adder, a multiplier, or any other functional unit or combination thereof.

Since the graphics-processing module requires more than one clock cycle to complete the execution, a second code segment might be accessed per a second program counter immediately one clock cycle after the execution of the first code segment. The second program counter is associated with a second execution thread, wherein each of the execution threads processes a unique vertex.

To this end, the second code segment might begin execution in the graphics-processing module prior to the completion of the execution of the first code segment in the graphics-processing module. In use the graphics-processing module requires a predetermined number of cycles for every thread to generate an output. Thus, the various steps of the present example might be repeated for every predetermined number of cycles.

This technique offers numerous advantages over the prior art. Of course, the functional units of the present invention are used more efficiently. Further, the governing code might be written more efficiently when the multiple threading scheme is assumed to be used.

For example, in the case where the graphics-processing module includes a multiplier that requires three clock cycles to output an answer, it would be necessary to include two no operation commands between subsequent operations such as a=b*c and d=e*a, since "a" would not be available until after the three clock cycles. In the present embodiment, however, the code might simply call d=e*a immediately subsequent to a=b*c, because it can be assumed that such code will be executed as one of three execution threads that are called once every three clock cycles.
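The following sketch simulates the example above: three threads issue in round-robin order into a multiplier with a three-cycle latency, and each thread runs a=b*c followed directly by d=e*a. The program encoding, the function `run`, and the `ready_at` bookkeeping are hypothetical and serve only to check that the dependent operand is ready without no-operation padding.

```python
from typing import Dict, List, Tuple

NUM_THREADS = 3
MULTIPLIER_LATENCY = 3  # cycles before a product can be read back

# Each instruction is (destination, operand1, operand2): dest = op1 * op2.
PROGRAM: List[Tuple[str, str, str]] = [("a", "b", "c"), ("d", "e", "a")]


def run(inputs: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Interleave one copy of PROGRAM per thread, one issue per clock cycle."""
    regs = [dict(values) for values in inputs]
    ready_at = [{name: 0 for name in values} for values in inputs]
    pcs = [0] * NUM_THREADS
    cycle = 0
    while any(pc < len(PROGRAM) for pc in pcs):
        thread = cycle % NUM_THREADS          # round-robin thread selection
        if pcs[thread] < len(PROGRAM):
            dest, op1, op2 = PROGRAM[pcs[thread]]
            # With three threads, a dependent operand is always ready by now,
            # so PROGRAM needs no no-operation padding.
            assert ready_at[thread][op1] <= cycle
            assert ready_at[thread][op2] <= cycle
            regs[thread][dest] = regs[thread][op1] * regs[thread][op2]
            ready_at[thread][dest] = cycle + MULTIPLIER_LATENCY
            pcs[thread] += 1
        cycle += 1
    return regs


results = run([
    {"b": 2.0, "c": 3.0, "e": 4.0},
    {"b": 1.0, "c": 5.0, "e": 2.0},
    {"b": 0.5, "c": 4.0, "e": 6.0},
])
```

Each thread issues once every three cycles, so the product written by its first instruction becomes available exactly when its second instruction issues.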

FIG. 4B is a flow diagram illustrating a manner in which the method of FIG. 4A is carried out. As shown, each execution thread has an associated program counter 450 that is used to access instructions, or code segments, in instruction memory 452. Such instructions might then be used to operate a graphics-processing module such as an adder 456, a multiplier 454, and/or an inverse logic unit or register 459.

In order to accommodate a situation where at least two of the foregoing processing modules are used in tandem, at least one code segment delay 457 is employed between the graphics-processing modules. In the case where a three-thread framework is employed, a three-clock cycle code segment delay 457 is used. In one embodiment, the code segment delay 457 is used when a multiplication instruction is followed by an addition instruction. In such case, the addition instruction is not executed until three clock cycles after the execution of the multiplication instruction in order to ensure that time has elapsed which is sufficient for the multiplier 454 to generate an output.

After the execution of each instruction, the program counter 450 of the current execution thread is updated and the program counter of the next execution thread is called by module 458 in a round robin sequence to access an associated instruction. It should be noted that the program counters might be used in any fashion including, but not limited to incrementing, jumping, calling and returning, performing a table jump, and/or dispatching. Dispatching refers to determining a starting point of code segment execution based on a received parameter. Further, it is important to understand that the principles associated with the present multiple thread execution framework might also be applied to the lighting module 54 of the graphics-processing pipeline of the present invention.

In the case where a three-thread framework is employed, each thread is allocated one input buffer and one output buffer at any one time. This allows loading of three more commands with data while processing three commands. The input buffers and output buffers are assigned in a round robin sequence in a manner that will be discussed later with reference to FIGS. 27 and 28.

The execution threads are thus temporally and functionally interleaved. This means that each function unit is pipelined into three stages and each thread occupies one stage at any one time. In one embodiment, the three threads might be set to always execute in the same sequence, i.e. zero then one then two. Conceptually, the threads enter a function unit at t=clock modulo three. Once a function unit starts work, it takes three cycles to deliver the result (except the ILU that takes six), at which time the same thread is again active.

FIG. 5 illustrates the functional units of the transform module 52 of FIG. 4 in accordance with one embodiment of the present invention. As shown, included are input buffers 400 that are adapted for being coupled to VAB 50 for receiving vertex data therefrom.

A multiplication logic unit (MLU) 500 has a first input coupled to an output of input buffers 400. As an option, the output of MLU 500 might have a feedback loop 502 coupled to the first input thereof.

Also provided is an arithmetic logic unit (ALU) 504 having a first input coupled to an output of MLU 500. The output of ALU 504 further has a feedback loop 506 connected to the second input thereof. Such feedback loop 506 may further have a delay 508 coupled thereto. Coupled to an output of ALU 504 is an input of a register unit 510. It should be noted that the output of register unit 510 is coupled to the first and second inputs of MLU 500.

An inverse logic unit (ILU) 512 is provided including an input coupled to the output of ALU 504 for performing an inverse or an inverse square root operation. In an alternate embodiment, ILU 512 might include an input coupled to the output of register unit 510.

Further included is a conversion, or smearing, module 514 coupled between an output of ILU 512 and a second input of MLU 500. In use the conversion module 514 serves to convert scalar vertex data to vector vertex data. This is accomplished by multiplying the scalar data by a vector so that the vector operators such as the multiplier and/or adder may process it. For example, a scalar A, after conversion, may become a vector (A,A,A,A). In an alternate embodiment, the smearing module 514 might be incorporated into the multiplexers associated with MLU 500, or any other component of the present invention. As an option, a register 516 might be coupled between the output of ILU 512 and an input of the conversion unit 514. Further, such register 516 might be threaded.
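The smearing step can be illustrated as follows. The helper names `smear` and `mul4` are hypothetical, and the use of a reciprocal as the scalar is only an example of a value the ILU 512 might produce.

```python
from typing import Tuple

Vec4 = Tuple[float, float, float, float]


def smear(scalar: float) -> Vec4:
    """Replicate a scalar into all four vector lanes: A becomes (A, A, A, A)."""
    return (scalar, scalar, scalar, scalar)


def mul4(a: Vec4, b: Vec4) -> Vec4:
    """Component-wise multiply, standing in for four parallel multipliers."""
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3])


inv_w = 0.5                        # example scalar, e.g. a reciprocal
position = (2.0, 4.0, 6.0, 2.0)
scaled = mul4(position, smear(inv_w))
```

Once smeared, the scalar can be consumed by the same four-wide vector operators as any other operand.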

Memory 410 is coupled to the second input of MLU 500 and the output of ALU 504. In particular, memory 410 has a read terminal coupled to the second input of MLU 500. Further, memory 410 has a write terminal coupled to the output of ALU 504.

The memory 410 has stored therein a plurality of constants and variables for being used in conjunction with the input buffer 400, MLU 500, ALU 504, register unit 510, ILU 512, and the conversion module 514 for processing the vertex data. Such processing might include transforming object space vertex data into screen space vertex data, generating vectors, etc.

Finally, an output converter 518 is coupled to the output of ALU 504. The output converter 518 serves for being coupled to a lighting module 54 via output buffers 402 to output the processed vertex data thereto. All data paths except for the ILU might be designed to be 128 bits wide or other data path widths may be used.

FIG. 6 is a schematic diagram of MLU 500 of the transform module 52 of FIG. 5 in accordance with one embodiment of the present invention. As shown, MLU 500 of the transform module 52 includes four multipliers 600 that are coupled in parallel.

MLU 500 of transform module 52 is capable of multiplying two four-component vectors in three different ways, or passing one four-component vector. Table 2 illustrates the operations associated with MLU 500 of transform module 52.

TABLE 2

CMLU_MULT   o[0] = a[0]*b[0], o[1] = a[1]*b[1], o[2] = a[2]*b[2], o[3] = a[3]*b[3]
CMLU_MULA   o[0] = a[0]*b[0], o[1] = a[1]*b[1], o[2] = a[2]*b[2], o[3] = a[3]
CMLU_MULB   o[0] = a[0]*b[0], o[1] = a[1]*b[1], o[2] = a[2]*b[2], o[3] = b[3]
CMLU_PASA   o[0] = a[0], o[1] = a[1], o[2] = a[2], o[3] = a[3]
CMLU_PASB   o[0] = b[0], o[1] = b[1], o[2] = b[2], o[3] = b[3]
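The operations of Table 2 can be written out directly. The function `mlu` below is a hypothetical software rendering of the table, with the operation selected by its mnemonic; special value handling is not modelled.

```python
from typing import Tuple

Vec4 = Tuple[float, float, float, float]


def mlu(op: str, a: Vec4, b: Vec4) -> Vec4:
    """Apply one Table 2 operation to the four-component inputs a and b."""
    products = (a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3])
    if op == "CMLU_MULT":
        return products
    if op == "CMLU_MULA":            # multiply xyz, pass the fourth component of a
        return (products[0], products[1], products[2], a[3])
    if op == "CMLU_MULB":            # multiply xyz, pass the fourth component of b
        return (products[0], products[1], products[2], b[3])
    if op == "CMLU_PASA":
        return a
    if op == "CMLU_PASB":
        return b
    raise ValueError(f"unknown MLU operation: {op}")


a = (1.0, 2.0, 3.0, 4.0)
b = (5.0, 6.0, 7.0, 8.0)
mult = mlu("CMLU_MULT", a, b)
mula = mlu("CMLU_MULA", a, b)
mulb = mlu("CMLU_MULB", a, b)
```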

Possible A and B inputs are shown in Table 3.

TABLE 3

MA_M   MLU
MA_V   Input Buffer
MA_R   RLU (shared with MB_R)
MB_I   ILU
MB_C   Context Memory
MB_R   RLU (shared with MA_R)

Table 4 illustrates a vector rotate option capable of being used for cross products.

TABLE 4

MR_NONE   No change
MR_ALBR   Rotate A[XYZ] vector left, B[XYZ] vector right
MR_ARBL   Rotate A[XYZ] vector right, B[XYZ] vector left
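The sketch below shows how the two rotate options of Table 4 might yield a cross product from two component-wise multiplies and one subtraction. The rotation convention is an assumption: "left" is taken to mean (x, y, z) becomes (y, z, x) and "right" to mean (x, y, z) becomes (z, x, y). The function names are hypothetical.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]


def rotate_left(v: Vec3) -> Vec3:
    """Assumed convention: (x, y, z) -> (y, z, x)."""
    return (v[1], v[2], v[0])


def rotate_right(v: Vec3) -> Vec3:
    """Assumed convention: (x, y, z) -> (z, x, y)."""
    return (v[2], v[0], v[1])


def mul3(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise multiply of the XYZ lanes."""
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2])


def cross(a: Vec3, b: Vec3) -> Vec3:
    """Cross product built from an MR_ALBR multiply minus an MR_ARBL multiply."""
    first = mul3(rotate_left(a), rotate_right(b))    # (ay*bz, az*bx, ax*by)
    second = mul3(rotate_right(a), rotate_left(b))   # (az*by, ax*bz, ay*bx)
    return (first[0] - second[0], first[1] - second[1], first[2] - second[2])


unit_z = cross((1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
general = cross((1.0, 2.0, 3.0), (4.0, 5.0, 6.0))
```

Under this convention the subtraction could be carried out by the adders with one input negated, so no dedicated cross-product hardware would be needed.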

FIG. 7 is a schematic diagram of ALU 504 of transform module 52 of FIG. 5 in accordance with one embodiment of the present invention. As shown, ALU 504 of transform module 52 includes three adders 700 coupled in parallel/series. In use ALU 504 of transform module 52 can add two three component vectors, pass one four component vector, or smear a vector component across the output. Table 5 illustrates various operations of which ALU 504 of transform module 52 is capable.

TABLE 5

CALU_ADDA    o[0] = a[0]+b[0], o[1] = a[1]+b[1], o[2] = a[2]+b[2], o[3] = a[3]
CALU_ADDB    o[0] = a[0]+b[0], o[1] = a[1]+b[1], o[2] = a[2]+b[2], o[3] = b[3]
CALU_SUM3B   o[0123] = b[0] + b[1] + b[2]
CALU_SUM4B   o[0123] = b[0] + b[1] + b[2] + b[3]
CALU_SMRB0   o[0123] = b[0]
CALU_SMRB1   o[0123] = b[1]
CALU_SMRB2   o[0123] = b[2]
CALU_SMRB3   o[0123] = b[3]
CALU_PASA    o[0] = a[0], o[1] = a[1], o[2] = a[2], o[3] = a[3]
CALU_PASB    o[0] = b[0], o[1] = b[1], o[2] = b[2], o[3] = b[3]
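The operations of Table 5 can likewise be written out, and combining a component-wise multiply with CALU_SUM4B gives a four-term dot product smeared across the output. The functions `alu`, `mul4`, and `dot4` are hypothetical names used only for this sketch; sign modification and special value handling are not modelled.

```python
from typing import Tuple

Vec4 = Tuple[float, float, float, float]


def alu(op: str, a: Vec4, b: Vec4) -> Vec4:
    """Apply one Table 5 operation to the four-component inputs a and b."""
    if op == "CALU_ADDA":
        return (a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3])
    if op == "CALU_ADDB":
        return (a[0] + b[0], a[1] + b[1], a[2] + b[2], b[3])
    if op == "CALU_SUM3B":
        total = b[0] + b[1] + b[2]
        return (total, total, total, total)
    if op == "CALU_SUM4B":
        total = b[0] + b[1] + b[2] + b[3]
        return (total, total, total, total)
    if op in ("CALU_SMRB0", "CALU_SMRB1", "CALU_SMRB2", "CALU_SMRB3"):
        lane = b[int(op[-1])]        # smear the selected component of b
        return (lane, lane, lane, lane)
    if op == "CALU_PASA":
        return a
    if op == "CALU_PASB":
        return b
    raise ValueError(f"unknown ALU operation: {op}")


def mul4(a: Vec4, b: Vec4) -> Vec4:
    """Component-wise multiply, standing in for the MLU feeding the ALU B input."""
    return (a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3])


def dot4(a: Vec4, b: Vec4) -> Vec4:
    """Four-term dot product: multiply, then sum the four products."""
    unused = (0.0, 0.0, 0.0, 0.0)    # the A input is ignored by CALU_SUM4B
    return alu("CALU_SUM4B", unused, mul4(a, b))


a = (1.0, 2.0, 3.0, 4.0)
b = (5.0, 6.0, 7.0, 8.0)
dot = dot4(a, b)
added = alu("CALU_ADDA", a, b)
```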

Table 6 illustrates the A and B inputs of ALU 504 of transform module 52.

TABLE 6

AA_A   ALU (one instruction delay)
AA_C   Context Memory
AB_M   MLU

It is also possible to modify the sign bits of the A and B input by effecting no change, negation of B, negation of A, absolute value A,B. It should be noted that when ALU 504 outputs scalar vertex data, this scalar vertex data is smeared across the output in the sense that each output represents the scalar vertex data. The pass control signals of MLU 500 and ALU 504 are each capable of disabling all special value handling during operation.

FIG. 8 is a schematic diagram of the vector register file 510 of transform module 52 of FIG. 5 in accordance with one embodiment of the present invention. As shown, the vector register file 510 includes four sets of registers 800 each having an output connected to a first input of a corresponding multiplexer 802 and an input coupled to a second input of the corresponding multiplexer 802.

In one embodiment of the present invention, the vector register file 510 is threaded. That is, there are three copies of the vector register file 510

