Multimedia compression system with additive temporal layers. Number: 7,082,164, from the United States Patent and Trademark Office (PTO)


Title: Multimedia compression system with additive temporal layers

Abstract: A multimedia compression system for generating frame rate scaleable data in the case of universally scaleable data. Universally scaleable data is scaleable across all of the relevant characteristics of the data (e.g., frame rate, resolution, and quality for video). The scaleable data generated by the compression system includes multiple additive layers for each characteristic across which the data is scaleable. For video, the frame rate layers are additive temporal layers, the resolution layers are additive base and enhancement layers, and the quality layers are additive index planes of embedded codes. Various techniques can be used for generating these layers (e.g., Laplacian pyramid decomposition or wavelet decomposition for generating the resolution layers; tree structured vector quantization or tree structured scalar quantization for generating the quality layers). The system further provides for embedded inter-frame compression in the context of frame rate scalability, and non-redundant layered multicast network delivery of the scaleable data.

Patent Number: 7,082,164 Issued on 07/25/2006 to Chaddha


Inventors: Chaddha; Navin (Sunnyvale, CA)
Assignee: Microsoft Corporation (Redmond, WA)
Appl. No.: 10/151,455
Filed: May 20, 2002


Related U.S. Patent Documents

Application Number    Filing Date    Patent Number    Issue Date
08888422              Jul., 1997     6392705
08714447              Mar., 1997

Current U.S. Class: 375/240.12 ; 382/238
Current International Class: H04N 7/12 (20060101)
Field of Search: 375/240.12,240.11,240.14,240.16,240.21,240.24,240.22 382/253,238 348/388.1 341/200


References Cited [Referenced By]

U.S. Patent Documents
4816914 March 1989 Ericsson
5144425 September 1992 Joseph
5194950 March 1993 Murakami et al.
5231599 July 1993 Peters et al.
5235419 August 1993 Krause
5331637 July 1994 Francis
5349383 September 1994 Parke
5367385 November 1994 Yuan
5396497 March 1995 Veltman
5418568 May 1995 Keith
5418571 May 1995 Ghanbari
5426462 June 1995 Bui
5481543 January 1996 Veltman
5487167 January 1996 Dinallo
5502727 March 1996 Catanzaro
5510834 April 1996 Weiss
5512938 April 1996 Ohno
5517494 May 1996 Green
5521630 May 1996 Chen
5521918 May 1996 Kim
5530484 June 1996 Bhatt
5557749 September 1996 Norris
5560038 September 1996 Haddock
5574911 November 1996 D'Angelo
5577258 November 1996 Cruz
5583652 December 1996 Ware
5585852 December 1996 Agarwal
5592228 January 1997 Dachiku
5594911 January 1997 Cruz
5596493 January 1997 Tone
5604867 February 1997 Harwood
5621660 April 1997 Chaddha et al.
5649030 July 1997 Normile et al.
5664044 September 1997 Ware
5673265 September 1997 Gupta et al.
5694173 December 1997 Kimura et al.
5708473 January 1998 Mead
5731840 March 1998 Kikuchi
5742343 April 1998 Haskell
5745379 April 1998 Lewis
5757306 May 1998 Nomura
5758194 May 1998 Kuzma
5768533 June 1998 Ran
5768535 June 1998 Chaddha
5784572 July 1998 Rostaker
5796434 August 1998 Lempel
5832229 November 1998 Tomoda
5844613 December 1998 Chaddha
5852565 December 1998 Demos
5859667 January 1999 Kondo et al.
5864366 January 1999 Yeo
5874986 February 1999 Gibbon
5884004 March 1999 Sato
5886733 March 1999 Zdepski
5898686 April 1999 Virgile
5907360 May 1999 Kessler et al.
5926226 July 1999 Proctor et al.
5946316 August 1999 Chen et al.
6084908 July 2000 Chiang et al.
6157656 December 2000 Lindgren et al.
6160846 December 2000 Chiang et al.
6233017 May 2001 Chaddha
6337881 January 2002 Chaddha
6392705 May 2002 Chaddha
6564262 May 2003 Chaddha

Other References

Hung, Andy C. et al., "Error Resilient Pyramid Vector Quantization for Image Compression," Proceedings of 1st Int'l. Conference on Image Processing, IEEE Signal Process. Soc. vol. 1 Austin, TX, USA, 13-16 Nov. 1994, pp. 583-587. cited by other .
Bolot, Jean-Chrysostome, et al., "Scalable Feedback Control for Multicast Video Distribution in the Internet," SIGCOMM 94 -Aug. 1994 London, England UK, copyright 1994 ACM, pp. 58-67. cited by other .
Moura, Jose M.F., et al., "Retrieving quality video across heterogeneous networks--Video over Wireless," IEEE Personal Communications, Feb. 1996, pp. 44-54. cited by other .
Birney, Keith A., et al., "On the Modeling of DCT and Subband Image Data for Compression," IEEE Transactions on Image Processing, vol. 4, No. 2, Feb. 1995, pp. 186-193. cited by other .
Crutcher, Laurence, "The Networked Video Jukebox," IEEE Trans. Circuits Syst. Video Technol. (USA), vol. 4, No. 2, pp. 105-120. cited by other .
Neogi, Raja, "Embedded Real-Time Video Decompression Algorithm and Architecture for HDTV Applications," ICAPP 95, IEEE First ICA/sub 3/PP IEEE 1st Int'l. Conference on Algorithms & Architectures for Parallel Processing (95TH0682-5), pp. 414-421, vol. 1, 1995. cited by other .
"Quadtree Based Adaptive Lossy Coding of Motion Vectors". cited by other .
"A Frame-work for Live Multicast of Video Streams over the Internet". cited by other .
"Predictive Hierarchical Table-Lookup Vector Quantization with Quadtree Encoding". cited by other .
Yavatkar et al., "Optimistic strategies for large-scale dissemination of multimedia information," Proceeding of the conference on Multimedia '93, 1993, pp. 13-20. cited by other .
Amir, Elan, et al., "An Application Level Video Gateway," ACM Multimedia, Nov., 1995, pp. 1-10. cited by other .
McCanne, Steven, et al., "vic: A Flexible Framework for Packet Video," ACM Multimedia, Nov. 1995 pp. 1-12. cited by other .
Chaddha, N., et al., "An end to end software only scalable video delivery systems", Proceedings Networks and Operating System Support for Digital Audio and Video, pp. 130-141, Apr. 21, 1995. cited by other.

Primary Examiner: Philippe; Gims
Attorney, Agent or Firm: Microsoft Corporation

Parent Case Text



RELATED APPLICATIONS

This is a continuation of U.S. patent application Ser. No. 08/888,422, filed Jul. 7, 1997, now U.S. Pat. No. 6,392,705, which is a divisional of U.S. patent application Ser. No. 08/714,447, filed Mar. 17, 1997, now abandoned.
Claims



The invention claimed is:

1. A method used in compressing data, the method comprising: converting data into a series of data vectors; making a prediction of a current data vector based at least in part on one previous data vector; segmenting the current data vector into a plurality of sub-vectors; and mapping, using a hierarchical lookup table comprising a plurality of lookup tables, the plurality of sub-vectors to a set of codes by successive utilization of the plurality of lookup tables in stages so that one of the codes is generated in response to each of the sub-vectors wherein the set of codes comprise codes of different lengths.

2. A method used in compressing data, the method comprising: converting data into a series of data vectors; making a prediction of a current data vector based at least in part on one previous data vector; segmenting the current data vector into a plurality of sub-vectors; and mapping, using a hierarchical lookup table comprising a plurality of lookup tables, the plurality of sub-vectors to a set of codes by successive utilization of the plurality of lookup tables in stages so that one of the codes is generated in response to each of the sub-vectors wherein the set of codes comprise non-embedded codes.

3. A method used in compressing data, the method comprising: converting data into a series of data vectors; making a prediction of a current data vector based at least in part on one previous data vector; and segmenting the current data vector into a plurality of sub-vectors wherein the plurality of sub-vectors comprise at least two sub-vectors of different sizes.

4. A data compression system comprising: a predictor configured to make a prediction of a current data vector based on at least one previous data vector; and a segmentor coupled to receive the prediction and configured to segment the current data vector into a plurality of sub-vectors based on the prediction, a hierarchical lookup table comprising a plurality of lookup tables, the hierarchical lookup table configured to map the plurality of sub-vectors to a set of codes by successive utilization of the plurality of lookup tables in stages so that one of the codes is generated in response to each of the sub-vectors, the hierarchical lookup table being coupled to the segmentor for receiving the plurality of sub-vectors wherein the set of codes comprise codes of different lengths.

5. A data compression system comprising: a predictor configured to make a prediction of a current data vector based on at least one previous data vector; and a segmentor coupled to receive the prediction and configured to segment the current data vector into a plurality of sub-vectors based on the prediction, a hierarchical lookup table comprising a plurality of lookup tables, the hierarchical lookup table configured to map the plurality of sub-vectors to a set of codes by successive utilization of the plurality of lookup tables in stages so that one of the codes is generated in response to each of the sub-vectors, the hierarchical lookup table being coupled to the segmentor for receiving the plurality of sub-vectors wherein the set of codes comprise non-embedded codes.

6. A data compression system comprising: a predictor configured to make a prediction of a current data vector based on at least one previous data vector; and a segmentor coupled to receive the prediction and configured to segment the current data vector into a plurality of sub-vectors based on the prediction wherein the plurality of sub-vectors comprise at least two sub-vectors of different sizes.

7. One or more computer readable media having stored thereon a data compression program including a plurality of instructions that, when executed by one or more processors, causes the one or more processors to: make a prediction of a current data vector based on at least one previous data vector; and segment the current data vector into a plurality of sub-vectors based at least in part on the prediction, map, using a hierarchical lookup table comprising a plurality of lookup tables, the plurality of sub-vectors to a set of codes by successive utilization of the plurality of lookup tables in stages so that one of the codes is generated in response to each of the sub-vectors wherein the set of codes comprise codes of different lengths.

8. One or more computer readable media having stored thereon a data compression program including a plurality of instructions that, when executed by one or more processors, causes the one or more processors to: make a prediction of a current data vector based on at least one previous data vector; and segment the current data vector into a plurality of sub-vectors based at least in part on the prediction, map, using a hierarchical lookup table comprising a plurality of lookup tables, the plurality of sub-vectors to a set of codes by successive utilization of the plurality of lookup tables in stages so that one of the codes is generated in response to each of the sub-vectors wherein the set of codes comprise non-embedded codes.

9. One or more computer readable media having stored thereon a data compression program including a plurality of instructions that, when executed by one or more processors, causes the one or more processors to: make a prediction of a current data vector based on at least one previous data vector; and segment the current data vector into a plurality of sub-vectors based at least in part on the prediction wherein the plurality of sub-vectors comprise at least two sub-vectors of different sizes.
Description



BACKGROUND OF THE INVENTION

The present invention relates to multimedia data processing. More particularly, it relates to the compression and network delivery of scaleably formatted multimedia data, for example, still and video images, speech, and music. A major objective of the present invention is to enhance streaming multimedia applications over heterogeneous networks. In a streaming multimedia application, multimedia data is packetized, delivered over a network, and played as the packets are being received at the receiving end, as opposed to being played only after all packets have been downloaded.

As computers are becoming vehicles of human interaction, the demand is rising for the interaction to be more immediate and complete. The effort is now on to provide such data intensive services as multicast video, on-demand video, and video collaboration, e.g., video conferencing and interactive video. These services are provided across networks.

The computer networks of today and of the foreseeable future are heterogeneous. This means that the computers on the network possess varying computational power, e.g., 40 MHz Intel 486 CPU or 150 MHz Intel Pentium CPU, on-chip media processing or none. It also means that the connections of the network can be of varying topologies, e.g., ATM, Ethernet, ISDN, POTS ("plain old telephone system"), or wireless, possessing varying bandwidth capacities.

Multimedia data consists of different kinds of data, including video images and audio signals. For each kind of multimedia data, a certain number of characteristics can be used to describe that data. For example, resolution (the amount of detail in the image) and quality (the fidelity of the image being displayed to the original image) can be used to describe still images; resolution, quality, and frame rate (the rate at which images change) can be used to describe video; and resolution (audio samples per second) and quality (the fidelity of the sample being played to the original sample) can be used to describe audio. These are not the only sets of characteristics which can be used to describe these different multimedia data types.

Multimedia is experienced by playing it. The enjoyability of multimedia playback, and therefore the usefulness, depends, in large part, upon the particular characteristics of the multimedia data. The more of a positive characteristic that the multimedia data possesses, the greater the enjoyment in playback of that data. With video, for example, playback is generally superior the higher the resolution, the quality, and the frame rate.

Multimedia data consumes space. The amount of space that data consumes depends upon the degree to which the multimedia possesses certain characteristics. With video, for example, the higher the resolution, the quality, and the frame rate, the more data is required to describe the video data. Thus, greater enjoyment of multimedia comes at the cost of greater data requirements.

Networks introduce a temporal element to data. Networks transmit data across the network connections over time. The amount of data that a network connection can transmit in a certain amount of time is the bandwidth of that connection.

The bandwidth required to transmit multimedia data over a network is a function of the characteristics of that data. With video, for example, the higher the resolution, the quality, and the frame rate, the higher the bandwidth required to transmit that video. Once the level of resolution, quality, and frame rate of video content is known, the bandwidth required to transmit that content can be calculated.
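As a rough illustration of that calculation, the sketch below estimates the raw (uncompressed) bit rate of a video stream from its resolution and frame rate. The function name and the default of 16 bits per pixel (the average for YUV 4:2:2 sampling with 8-bit samples) are assumptions made for this example and are not taken from the patent; a compressed stream would need far less.

```python
def raw_video_bandwidth_bps(width, height, frames_per_second, bits_per_pixel=16):
    """Return the uncompressed bit rate, in bits per second, of a video stream.

    bits_per_pixel=16 is an assumed default: with YUV 4:2:2 sampling and 8-bit
    samples, each pixel carries 8 bits of luma plus, on average, 8 bits of
    chroma shared with its horizontal neighbor.
    """
    return width * height * bits_per_pixel * frames_per_second


# Halving the frame rate halves the required bandwidth;
# halving both image dimensions quarters it.
full = raw_video_bandwidth_bps(320, 240, 30)
half_rate = raw_video_bandwidth_bps(320, 240, 15)
quarter_size = raw_video_bandwidth_bps(160, 120, 30)
```

Quality is left out of this estimate because it acts through the compression step rather than through the raw pixel count.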

Often, bandwidth is the initial constraining factor in transmitting multimedia data. That is, the available bandwidth of a particular network connection is known. With bandwidth known, the level of the characteristics of multimedia data can, in theory, be adjusted to ensure that the data can be transmitted over the network. With video, for example, if bandwidth is known, the frame rate, resolution, and quality of that video can each, in theory, be raised or lowered to ensure the video can be transmitted over the bandwidth.

Networks transmit data across network connections to computers and other devices on the network. After multimedia data reaches a computer of the network, that computer can attempt to play back the data. In playing back that data, demands are placed upon the computational power of the computer. In general, the higher the level of characteristics of certain multimedia data, the more computational power required to play back that data. With video, for example, the higher the resolution, the higher the quality, and the higher the frame rate, the greater the computational power required to play back the video.

Often, computational power is the initial constraining factor in playing back multimedia data. That is, the available computational power of a particular computer is known. With computational power known, the level of the characteristics of multimedia data can, in theory, be adjusted to ensure that the data can be played back by that computer. With video, for example, if available computational power is known, the frame rate, resolution, and quality of that video can each, in theory, be raised or lowered to ensure the video can be played back on that computer.

In a heterogeneous network, differential bandwidth and computational power constraints preclude all network participants from experiencing the best possible multimedia data playback. In a "lowest common denominator" approach, multimedia data which can be processed by the network participant with the lowest bandwidth and computational power capabilities would be generated and delivered not only to that participant, but to all network participants. This is undesirable, however, because the network participants with greater bandwidth and computational power capabilities will receive sub-optimal data.

Alternative approaches, e.g., MPEG-1, generate separate data files, with different characteristic levels (e.g., resolution, frame rate, quality) targeted for different bandwidth/computational power capabilities. Each network participant receives near optimal multimedia data given that participant's bandwidth and computational power. This is undesirable, however, because multiple data files for each multimedia presentation consume a great deal of storage space. For systems which store many multimedia presentations, this approach quickly becomes infeasible.

The drawback of the multiple file approach is particularly apparent in the multicast case. With standard multicast, one data file or stream is transmitted by the server down a particular channel, and participants who subscribe to that channel receive that data. This minimizes the use of bandwidth because only one copy of the data traverses the network. If multiple redundant files or streams are used to transmit multimedia data down multiple multicast channels, bandwidth will be wasted, contrary to the very purpose of multicast.

Still other approaches provide limited scalability in a single file or stream approach. For example, in the case of video, quality scalability may be provided, but not frame rate or resolution scalability. These approaches are sub-optimal in that network participants do not have full flexibility in tailoring the video presentation to their needs and desires.

In addition, where a compression system provides for scalability, complexity is often introduced which compels modifications to existing compression techniques. In particular, introducing frame rate scalability compels modifications to inter-frame compression techniques. One such inter-frame compression technique is conditional replenishment ("CR"). Another is motion compensation ("MC").

CR is an inter-frame video compression technique well known in the art. CR, like all inter-frame compression techniques, is used to achieve higher compression ratios. CR operates upon blocks of adjacent frames. A block is a contiguous subset of a frame. CR determines whether a block in the current frame should be encoded. "Forward" CR makes this determination by comparing the current block against the similarly positioned block in a previous frame. On the "condition" that the difference between the two blocks is less than some predetermined threshold value, the current block is not encoded; instead, it is "replenished" from the previous block. "Reverse" CR compares the current block against the corresponding block in a subsequent frame.
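The forward CR decision described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: frames are assumed to be lists of rows of pixel intensities, and the sum of absolute differences used as the block difference measure is an assumption, since the text does not name one. The function name is hypothetical.

```python
def forward_cr(previous_frame, current_frame, block_size, threshold):
    """Decide, block by block, whether to encode or replenish.

    Returns a dict mapping each block's (top, left) origin to "encode" or
    "replenish". The difference measure (sum of absolute differences) is an
    assumption of this sketch.
    """
    height = len(current_frame)
    width = len(current_frame[0])
    decisions = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            difference = 0
            for y in range(top, min(top + block_size, height)):
                for x in range(left, min(left + block_size, width)):
                    difference += abs(current_frame[y][x] - previous_frame[y][x])
            # Below the threshold, the block is copied from the previous frame.
            decisions[(top, left)] = "replenish" if difference < threshold else "encode"
    return decisions


previous = [[10, 10, 50, 50],
            [10, 10, 50, 50]]
current = [[10, 11, 90, 90],
           [10, 10, 90, 90]]
decisions = forward_cr(previous, current, block_size=2, threshold=8)
```

Here the left block differs by a single intensity step and is replenished, while the right block has changed substantially and is encoded. Reverse CR would apply the same test against a subsequent frame instead of a previous one.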

MC is another inter-frame compression technique well known in the art. MC can be considered a more general case of CR. Whereas forward CR compares the current block against only the corresponding block in a previous frame, forward MC compares the current block against more than one comparably sized block in that previous frame. If a matching block is found in the previous frame, MC generates a vector indicating the direction and distance describing the motion of the previous block, and error data describing changes in the previous block. MC can also operate at the frame level, as opposed to block level. As with reverse CR, reverse MC involves analyzing the current frame against a subsequent frame.
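A block-level forward MC search of this kind might look like the sketch below. It assumes an exhaustive search over a small window and a sum-of-absolute-differences cost; both are illustrative choices that the text does not prescribe, and the function names are hypothetical.

```python
def block_sad(frame, top, left, block, size):
    """Sum of absolute differences between `block` and the frame region at (top, left)."""
    return sum(
        abs(frame[top + y][left + x] - block[y][x])
        for y in range(size)
        for x in range(size)
    )


def forward_mc(previous_frame, current_frame, top, left, size, search_range):
    """Find the best match in the previous frame for one current-frame block.

    Returns (motion_vector, error_block). motion_vector = (dy, dx) is how far
    the matched previous block moved to reach the current block's position;
    error_block is the residual (current block minus matched block).
    """
    height, width = len(previous_frame), len(previous_frame[0])
    block = [row[left:left + size] for row in current_frame[top:top + size]]
    best = None
    # Exhaustively test every candidate position inside the search window.
    for cand_top in range(max(0, top - search_range),
                          min(height - size, top + search_range) + 1):
        for cand_left in range(max(0, left - search_range),
                               min(width - size, left + search_range) + 1):
            cost = block_sad(previous_frame, cand_top, cand_left, block, size)
            if best is None or cost < best[0]:
                best = (cost, cand_top, cand_left)
    _, best_top, best_left = best
    motion_vector = (top - best_top, left - best_left)
    error_block = [
        [block[y][x] - previous_frame[best_top + y][best_left + x] for x in range(size)]
        for y in range(size)
    ]
    return motion_vector, error_block


previous = [[9, 8, 0, 0],
            [7, 6, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0]]
current = [[0, 0, 0, 0],
           [0, 0, 9, 8],
           [0, 0, 7, 6],
           [0, 0, 0, 0]]
vector, error = forward_mc(previous, current, top=1, left=2, size=2, search_range=2)
```

In this example the 2x2 patch has moved one row down and two columns right, so the search yields a motion vector of (1, 2) and an all-zero error block. Restricting the search to the single co-located candidate would reduce this to the comparison used by CR.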

SUMMARY OF THE INVENTION

The present invention provides for compression and multicast network delivery of multimedia data in a format scaleable across one or more of the characteristics of that data. There are several aspects of the invention that are brought together to achieve optimal benefits, but which can be used separately.

One aspect of the present invention is a compression system for generating frame rate scaleable data for video. Frame rate scaleable data is scaleable across at least frame rate, and possibly additional video characteristics.
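One way to picture frame rate scaleable data built from additive temporal layers is the sketch below. It assumes a dyadic split, in which each added layer doubles the frame rate; that particular assignment of frames to layers, and the function names, are assumptions made for illustration rather than details stated in this section.

```python
def temporal_layer(frame_index, num_layers):
    """Return the temporal layer (0 = base) of a frame under a dyadic split.

    With num_layers = 3, frames 0, 4, 8, ... form layer 0; frames 2, 6, 10, ...
    form layer 1; and the odd-numbered frames form layer 2.
    """
    step = 1 << (num_layers - 1)   # spacing between base-layer frames
    layer = 0
    while frame_index % step != 0:
        step >>= 1
        layer += 1
    return layer


def frames_received(total_frames, num_layers, layers_subscribed):
    """List the frame indices available when only the first layers are received."""
    return [
        i for i in range(total_frames)
        if temporal_layer(i, num_layers) < layers_subscribed
    ]


quarter_rate = frames_received(8, 3, 1)   # base layer only
half_rate = frames_received(8, 3, 2)      # base layer plus one more layer
full_rate = frames_received(8, 3, 3)      # all three layers
```

Because the layers are additive, a receiver that takes more layers gets the same frames as before plus new ones, with no frame sent twice.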

The present invention includes a compression system for generating universally scaleable data. Multimedia data can be described by a certain set of characteristics. Universally scaleable data is scaleable across that entire set. In the case of video, universally scaleable data is scaleable across frame rate, resolution, and quality. In the case of audio data, universally scaleable data is scaleable across resolution and quality. Universal scalability allows for flexible bandwidth and computational scalability. This makes universal scalability particularly useful for streaming multimedia applications over heterogeneous networks.

Another aspect of the present invention is a means for generating frame rate scaleable data which includes inter-frame compression. This results in an embedded inter-frame compression technique which achieves higher compression in addition to achieving frame rate scalability.

A final aspect of the present invention is multicast network delivery of frame rate scaleable data and universally scaleable data. Specifically, various layers of the scaleable data are divided and delivered over multiple multicast channels. Depending upon the bandwidth and computational power available to a network participant, that participant can subscribe to one or more multicast channels, thus optimizing the multimedia presentation to that participant.
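The subscription decision can be sketched as a simple greedy choice. The example below considers bandwidth only (computational power could be handled with a second budget in the same way), and the per-layer bit rates, the function name, and the receiver bandwidths are hypothetical values chosen for illustration.

```python
def choose_channels(layer_bitrates_kbps, available_kbps):
    """Return indices of the multicast channels a receiver should join.

    Layers are additive, so a receiver joins channels in order, starting with
    the base layer, and stops before the cumulative bit rate would exceed its
    available bandwidth.
    """
    joined = []
    total = 0
    for channel, rate in enumerate(layer_bitrates_kbps):
        if total + rate > available_kbps:
            break
        joined.append(channel)
        total += rate
    return joined


layers = [20, 40, 100, 300]                   # hypothetical per-layer rates, kbit/s
modem_user = choose_channels(layers, 28)      # joins only the base layer
isdn_user = choose_channels(layers, 128)      # joins the first two layers
lan_user = choose_channels(layers, 10_000)    # joins every layer
```

Each layer travels on its own channel exactly once, so receivers with different capabilities share the lower layers instead of pulling redundant copies of the whole presentation.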

It is not necessary for all aspects of the invention to be practiced together to attain advantages. However, when combined to yield a multimedia data compression and network delivery system, the result is optimally suited for streaming multimedia applications over heterogeneous networks. The multimedia data generated, delivered, and consumed by the compression and network delivery system is universally scaleable. In the case of video data, this means scalability in terms of frame rate, resolution and quality, and also in terms of bandwidth and computational power. The scaling can occur at any point in the system, including at the generation of the multimedia data, in transmission of the data over a network, or during playback of the data. Universal scalability allows for efficient and non-redundant storage of multimedia data, optimized and non-redundant multicast delivery of that data, and tradeoff among the multimedia characteristics (e.g., frame rate, resolution, and quality), and therefore bandwidth and computational power, to achieve an optimal combination of data playback, network traffic, and computational load. For example, bandwidth consumption of streamed video is a significant concern for some network administrators. Universal scalability allows these network administrators to restrict bandwidth usage of video streams to a certain limit. With this limit set, the data delivered can be optimized for frame rate, resolution, and quality, given the computational power of the recipient's computer. These and other features and advantages of the invention are apparent from the description below with reference to the following drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic illustration of a compression and network delivery system in accordance with the present invention.

FIG. 2 is a schematic illustration of universally scaleable video data in accordance with the present invention.

FIG. 3 is a schematic illustration of a compression system in accordance with the present invention.

FIG. 4 is a schematic illustration of an embedded inter-frame compression system for achieving frame rate scalability and additional compression in accordance with the present invention.

FIG. 5 is a schematic illustration of a multicast network delivery system in accordance with the present invention.

FIG. 6 is a schematic illustration of the Laplacian pyramid decomposition algorithm.

FIG. 7 is a schematic illustration of the Laplacian pyramid composition algorithm.

FIG. 8 is a schematic illustration of the hierarchical vector quantization encoding algorithm.

FIG. 9 is a schematic illustration of the embedded CR used in an experimental prototype in accordance with the present invention.

FIG. 10 is a schematic illustration of an arrangement of packets in accordance with the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring now to FIG. 1, a Compression System 1 and Network Delivery System 4 are provided in accordance with the present invention. Compression System 1 comprises Encoder 2 and Decoder 3. Network Delivery System 4 comprises Network Delivery Sender 5, Network 6, and Network Delivery Receiver 7.

Encoder 2 accepts as input Original Multimedia Data 8 and generates as output Formatted Encoded Data 9. Original Multimedia Data 8 comprises multimedia data, including still image, video, or audio. Original Multimedia Data 8 is in a standard uncompressed digital format. With still image or video, for example, the data could be in the YUV 4:2:2 format. Original Multimedia Data 8 can exist in the form of a static file, or as a dynamic stream. Formatted Encoded Data 9 is a compressed representation of Original Multimedia Data 8, and it is formatted for optimized delivery over heterogeneous networks.

Network Delivery Sender 5 accepts as input Formatted Encoded Data 9, and generates as output Streamed Formatted Encoded Data 10. Network Delivery Sender 5 streams Streamed Formatted Encoded Data 10 over Network 6. Network Delivery Sender 5 can stream this data for any purpose, including video-on-demand, multicast, and video-conferencing.

Network 6 transports Streamed Formatted Encoded Data 10 to Network Delivery Receiver 7. Network Delivery Receiver 7 generates as output Processed Formatted Encoded Data 11. Processed Formatted Encoded Data 11 comprises either all of Streamed Formatted Encoded Data 10, or a subset of that data.

Decoder 3 accepts as input Processed Formatted Encoded Data 11 and generates as output Decoded Multimedia Data 12. Decoded Multimedia Data 12 is a post-compression/decompression representation of Original Multimedia Data 8. Decoded Multimedia Data 12 is delivered to Playback Device 13 which plays the Decoded Multimedia Data 12. In the case of still image content, this would comprise displaying the image. In the case of audio content, this would comprise playing the audio content. In the case of video content, this would comprise playing the video content.

In general, the different components described above can be distributed across different hardware units.

The compression and network delivery system of FIG. 1 generates and processes universally scaleable data, including frame rate scaleable data. The data can be of any multimedia type, including still images, video, and audio. Universal scalability means that the multimedia data is scaleable across all relevant characteristics of the data.

FIG. 2 is a hierarchical depiction of universally scaleable data for video. Formatted Encoded Data 9, Streamed Formatted Encoded Data 10, and Processed Formatted Encoded Data 11 in FIG. 1 could be video data in the format depicted in FIG. 2. FIG. 2.a depicts the frame rate scaleable aspect of the universally scaleable data. FIG. 2.b depicts the resolution scaleable aspect of that data. FIG. 2.c depicts the quality scaleable aspect of that data.

FIG. 2.a depicts Frames F1-F9. Frames F1-F9 are nine sequential image frames from the video data. Frame F2 directly succeeds Frame F1, Frame F3 directly succeeds Frame F2, and so on. In accordance with the present invention, Frames F1-F9 are rearranged into Temporal Layers T1, T2, and T3. Temporal Layer T1 comprises every fourth frame, namely Frames F1, F5, and F9. Temporal Layer T2 comprises Frames F3 and F7. Temporal Layer T3 comprises Frames F2, F4, F6, and F8.

The temporal layers are additive. This means that layers are combined "first-to-last" to achieve successively higher frame rates. Thus, the first temporal layer achieves a certain frame rate. A second, higher frame rate, is achieved by combining the second temporal layer with the first. A third, and higher still frame rate is achieved by combining the third temporal layer with the first and second, and so on. Conversely, the additive layers are dropped "last-to-first". If the n-th temporal layer has been dropped, then temporal layers (n+1), (n+2), etc., have been dropped.

Referring now to FIG. 2, Temporal Layer T1, standing alone, provides for Frame Rate FR1. Temporal Layer T1 combined with Temporal Layer T2 comprises every second frame, namely Frames F1, F3, F5, F7, and F9, and provides for Frame Rate FR2 which is higher than Frame Rate FR1. Temporal Layer T1 combined with Temporal Layers T2 and T3 comprises every frame, namely Frames F1-F9, and provides for Frame Rate FR3 which is higher than Frame Rate FR2.

At any point in the process of encoding, delivering, and playing back the video frames, temporal layers can be dropped to achieve a desired frame rate. At the encoding stage, if Frame Rate FR2 is desired, for example, only Temporal Layers T1 and T2 need be generated. Temporal Layer T3 is not generated. Similarly, if the video content was encoded to provide for Frame Rate FR3, and Frame Rate FR1 is desired, Temporal Layers T2 and T3 can be dropped during network delivery or playback.
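The rearrangement of Frames F1-F9 into additive temporal layers, and the dropping of layers last-to-first, can be sketched as follows. This is a minimal illustration only: the function names are hypothetical, and the sketch assumes the dyadic pattern shown in FIG. 2.a, in which each successive layer holds the frames midway between those already placed.

```python
def temporal_layer(frame_number: int, num_layers: int) -> int:
    """Return the 1-based temporal layer of a 1-based frame number.

    Assumption: layer 1 holds every 2**(num_layers - 1)-th frame starting
    at frame 1, and each later layer holds the frames midway between the
    frames of the layers before it.
    """
    offset = frame_number - 1
    stride = 2 ** (num_layers - 1)
    for layer in range(1, num_layers + 1):
        if offset % stride == 0:
            return layer
        stride //= 2
    raise AssertionError("unreachable: a stride of 1 divides every offset")


def split_into_layers(frame_numbers, num_layers):
    """Group frame numbers by temporal layer, keeping their order."""
    layers = {layer: [] for layer in range(1, num_layers + 1)}
    for frame in frame_numbers:
        layers[temporal_layer(frame, num_layers)].append(frame)
    return layers


def combine_layers(layers, keep):
    """Merge the first `keep` layers back into sequential frame order.

    Layers beyond `keep` are dropped, which lowers the frame rate.
    """
    return sorted(f for layer in range(1, keep + 1) for f in layers[layer])


layers = split_into_layers(range(1, 10), num_layers=3)
print(layers)                     # T1, T2, T3 for Frames F1-F9
print(combine_layers(layers, 2))  # frames that provide Frame Rate FR2
```

Because the layers are additive, the same `combine_layers` call could serve at the encoder, during network delivery, or at playback; only the value of `keep` changes.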

In general, there can be any number of temporal layers and any number of frames in each temporal layer. In the preferred embodiment, there are four temporal layers. The first temporal layer comprises every eighth frame beginning with the first; the second layer comprises every eighth frame beginning with the fifth; the third layer comprises the remaining odd-numbered frames; and the fourth layer comprises the even-numbered frames. The first temporal layer provides for a frame rate of 3.75 frames per second ("fps"); the first and second layers combined correspond to a frame rate of 7.5 fps; the first, second, and third layers combined correspond to a frame rate of 15 fps; and all four layers combined correspond to a frame rate of 30 fps.
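The frame rates stated for the preferred embodiment halve with each layer that is dropped. A short sketch of that arithmetic follows; the function name is hypothetical, and the halving rule is an assumption inferred from the stated rates of 3.75, 7.5, 15, and 30 fps.

```python
def cumulative_frame_rates(full_rate_fps: float, num_layers: int) -> list:
    """Frame rate obtained by combining the first k layers, for k = 1..num_layers.

    Assumption: each added layer doubles the frame rate, so combining the
    first k of num_layers layers yields full_rate / 2**(num_layers - k).
    """
    return [full_rate_fps / 2 ** (num_layers - k)
            for k in range(1, num_layers + 1)]


print(cumulative_frame_rates(30, 4))  # rates for layers 1, 1-2, 1-3, 1-4
```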

FIG. 2.b depicts one representative frame from FIG. 2.a, in this case, Frame F1. Frame F1 comprises Base Layer B1, Enhancement Layer E1, and Enhancement Layer E2.

As with the temporal layers of FIG. 2.a, the base and enhancement layers of FIG. 2.b are additive (combining layers first-to-last achieves successively higher resolutions). Base Layer B1, the first layer, provides for Resolution R1, the smallest, or base resolution for Frame F1. Resolution R1 comprises the smallest number of picture elements ("pixels") of any resolution for Frame F1. Enhancement Layer E1 comprises certain data which, when combined with Base Layer B1 in a particular way, provides for Resolution R2. Resolution R2 is higher than Resolution R1. Similarly, Resolution R3, which is higher than both Resolution R1 and Resolution R2, is obtained by combining Base Layer B1 with Enhancement Layers E1 and E2 in a particular way.

Enhancement Layers E1 and E2 typically comprise error or difference data. The error or difference data is generated to take advantage of the redundancy across different resolutions. Enhancement Layer E1 is obtained by subtracting an up-sampled and filtered version of R1 from R2. Up-sampling and filtering are techniques well known in the art.
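The relationship between the base layer and the difference-data enhancement layers can be sketched as follows. This is an illustration under stated assumptions, not the filters of the invention: down-sampling is done here by averaging 2x2 pixel blocks, up-sampling by pixel replication, image dimensions are assumed divisible by two at every level, no quantization is applied, and all function names are hypothetical.

```python
def downsample(image):
    """Halve each dimension by averaging 2x2 blocks (assumed filter)."""
    return [
        [(image[r][c] + image[r][c + 1]
          + image[r + 1][c] + image[r + 1][c + 1]) / 4
         for c in range(0, len(image[0]), 2)]
        for r in range(0, len(image), 2)
    ]


def upsample(image):
    """Double each dimension by pixel replication (assumed filter)."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def subtract(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def decompose(image, num_enhancement_layers):
    """Return (base, [E1, E2, ...]), ordered from lowest resolution up."""
    levels = [image]
    for _ in range(num_enhancement_layers):
        levels.append(downsample(levels[-1]))
    levels.reverse()  # levels[0] is now the base resolution
    enhancements = [subtract(levels[i + 1], upsample(levels[i]))
                    for i in range(len(levels) - 1)]
    return levels[0], enhancements


def compose(base, enhancements):
    """Rebuild an image from the base plus however many layers were kept."""
    image = base
    for layer in enhancements:
        image = add(upsample(image), layer)
    return image


frame = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
base, (e1, e2) = decompose(frame, 2)
print(compose(base, [e1]))      # the middle resolution: E2 was dropped
print(compose(base, [e1, e2]))  # the full resolution
```

Passing a shorter list of enhancement layers to `compose` corresponds to dropping layers during delivery or playback.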

At any point in the process of encoding, delivering, and playing back the video frames, layers can be dropped to achieve a desired resolution. At the encoding stage, if Resolution R2 is desired, only Base Layer B1 and Enhancement Layer E1 are generated. Enhancement Layer E2 is not generated. Similarly, if the video content was encoded to provide for Resolution R3, and Resolution R1 is desired, Enhancement Layers E1 and E2 can be dropped during network delivery or playback.

In general, there can be any number of base and enhancement layers. In the preferred embodiment, there is one base layer and two enhancement layers. The resolution of the base layer is 160×120 pixels; the second resolution level is 320×240 pixels; and the third resolution level is 640×480 pixels.

FIG. 2.c depicts one representative layer from FIG. 2.b, in this case, Base Layer B1. Base Layer B1 comprises Index Planes P1-P5. Index Plane P1, the first plane, provides for Quality Q1, the lowest quality for Base Layer B1. Index Plane P1 is an encoded representation of Base Layer B1. For example, Index Plane P1 could comprise a series of codes. Each of these codes would represent one part of Base Layer B1.

The number of bits in a code determines how much information that code can represent. The more bits in a code, the more information that code can represent. Additionally, the more bits in a code, the more accurate can be the code's representation of what it encodes.

Index Plane P1 represents Base Layer B1 with Quality Q1. Quality Q1 is the fidelity that the decoded representation of Index Plane P1 bears toward the original Frame F1. Quality is measured typically by Signal-to-Noise-Ratio (SNR). If Index Plane P1 comprises a series of codes n bits long, then Quality Q1 is the level of quality afforded by n-bit encoding.

As with the temporal layers of FIG. 2.a, and the base and enhancement layers of FIG. 2.b, the index planes are additive (combining index planes first-to-last achieves successively higher quality). Index Plane P2, when combined with Index Plane P1, results in an encoding of Base Layer B1 with Quality Q2. If Index Plane P1 comprises a series of codes n bits long, and Index Plane P2 comprises a series of codes m bits long, then combining the two index planes will result in a series of codes n+m bits in length. Because the n+m-bit codes are longer than the n-bit codes, Quality Q2 can and should be higher than Quality Q1.

Index Planes P3-P5 are similarly combined with all of the preceding index planes to generate codes of increasing length, and thereby increasing Quality Q3-Q5.

In general, there can be any number of index planes. In the preferred embodiment, there are three index planes. The initial index plane comprises a series of codes 6 bits in length; the second plane comprises a series of codes 3 bits in length; and the third plane also comprises a series of codes 3 bits in length. Each code corresponds to a particular block in the frame.
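The way per-block codes from successive index planes concatenate into longer codes can be sketched with the 6-bit, 3-bit, and 3-bit plane sizes of the preferred embodiment. The function names and the packing order (first plane in the most significant bits) are assumptions made for illustration.

```python
PLANE_BITS = (6, 3, 3)  # bits contributed per block by planes 1, 2, 3


def split_code(code, plane_bits=PLANE_BITS):
    """Split one full-length code into per-plane pieces, first plane first."""
    total = sum(plane_bits)
    if not 0 <= code < (1 << total):
        raise ValueError("code does not fit in the combined plane width")
    pieces = []
    remaining = total
    for bits in plane_bits:
        remaining -= bits
        pieces.append((code >> remaining) & ((1 << bits) - 1))
    return pieces


def combine_planes(pieces, plane_bits=PLANE_BITS):
    """Concatenate the pieces of the first len(pieces) planes.

    Returns (code, bit_length). Supplying fewer pieces models dropped
    planes and yields a shorter, lower-quality code.
    """
    code, length = 0, 0
    for piece, bits in zip(pieces, plane_bits):
        code = (code << bits) | piece
        length += bits
    return code, length


pieces = split_code(0b101101011110)
print(pieces)                      # one piece per index plane
print(combine_planes(pieces[:2]))  # 9-bit code after dropping plane 3
```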

Collectively, the index planes correspond to a series of embedded codes. A code represents a thing. With an embedded code, a subset of the code will represent something else close to that thing. For example, say an n-bit embedded code represents image I. Then the first (n-1) bits of that code will represent an image I' which is an approximation of I.
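The embedded property can be illustrated with a uniform binary-tree scalar quantizer, chosen here only as a simple example of an embedded code. The value range of 0 to 256 and the function names are assumptions. Each bit halves the interval known to contain the value, so any prefix of the code decodes to an approximation of what the full code represents.

```python
def encode(value, num_bits, low=0.0, high=256.0):
    """Encode a value in [low, high) as successive interval-halving bits."""
    bits = []
    for _ in range(num_bits):
        mid = (low + high) / 2
        if value >= mid:
            bits.append(1)
            low = mid
        else:
            bits.append(0)
            high = mid
    return bits


def decode(bits, low=0.0, high=256.0):
    """Decode any prefix of a code to the midpoint of its interval."""
    for bit in bits:
        mid = (low + high) / 2
        if bit:
            low = mid
        else:
            high = mid
    return (low + high) / 2


code = encode(200, 8)
for k in range(len(code) + 1):
    # Each prefix is a coarser approximation of the same value.
    print(k, decode(code[:k]))
```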

In the preferred embodiment, the embedded codes are fixed in length rather than variable in length. Because the compression system generates fixed-length codes rather than variable-length codes during generation of the index planes, the system represents a wide range of video data well, rather than representing a limited range very well and the remainder poorly. Higher compression ratios are achieved later in the system by run-length coding the fixed-length codes of the index planes.
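Run-length coding of a sequence of fixed-length codes can be sketched as follows. The description does not specify the run-length format, so the (code, count) pair representation and the function names are assumptions.

```python
def run_length_encode(codes):
    """Collapse consecutive repeats into (code, run_length) pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return [tuple(run) for run in runs]


def run_length_decode(runs):
    """Expand (code, run_length) pairs back into the original sequence."""
    return [code for code, count in runs for _ in range(count)]


plane = [7, 7, 7, 2, 2, 9]
runs = run_length_encode(plane)
print(runs)
print(run_length_decode(runs) == plane)
```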

FIG. 2 depicts universally scaleable video data. Universal scalability can be used for other kinds of multimedia data, including still images and audio. With still images, resolution and quality are characteristics for which the scaleable representations depicted in FIG. 2.b and FIG. 2.c, respectively, can be used. Similarly, with audio, the concept of frame rate is generally thought not to apply. However, sampling rate (resolution) and amplitude (quality) do apply. The scaleable representations depicted in FIG. 2.b and FIG. 2.c, respectively, can be used for these two characteristics.

FIG. 3 depicts a Compression System 1' from FIG. 1 which processes video data. Compression System 1' comprises Encoder 2' and Decoder 3'. Encoder 2' generates universally scaleable data, or frame rate scaleable video data. Decoder 3' processes this data.

Encoder 2' includes Frame Rate Scaleable Component 31, and optionally includes Resolution Scaleable Component 32 and Quality Scaleable Component 33. Frame Rate Scaleable Component 31 rearranges the frames of Original Video Data 8 into temporal layers of the frame rate scaleable data format represented in FIG. 2.a. If a particular maximum frame rate is desired, Frame Rate Scaleable Component 31 can drop temporal layers unnecessary to achieve that frame rate.

For an individual frame of Original Video Data 8, Resolution Scaleable Component 32 generates base and enhancement layers of the resolution scaleable data format represented in FIG. 2.b. In the preferred embodiment, the method of generating these layers is Laplacian pyramid decomposition, a technique well known in the art. Laplacian pyramid decomposition performs sequenced actions upon these layers, including filtering and down-sampling, up-sampling and filtering. If a particular maximum resolution is desired, Resolution Scaleable Component 32 can omit the enhancement layers unnecessary to achieve that resolution.

Other methods for generating the base and enhancement layers to achieve resolution scalability include wavelet and subband decompositions. These techniques are also well known in the art.

For an individual layer generated by Resolution Scaleable Component 32, Quality Scaleable Component 33 generates index planes of the quality scaleable data format represented in FIG. 2.c. If a particular maximum quality is desired, Quality Scaleable Component 33 can omit the index planes unnecessary to achieve that quality.

In the preferred embodiment, to generate a particular enhancement layer, all of the index planes for the preceding layers are generated. This is because the highest quality enhancement layers generated by the Laplacian pyramid decompositions are obtained by using the best quality representations of the preceding layers.

Quality Scaleable Component 33 encodes (compresses) pixel data. The result of this encoding is the generation of a collection of codes. This collection of codes comprises the index planes. The index planes are enabled by the fact that these codes are embedded.

A number of different methods are known in the art for generating embedded codes. These include tree-structured scalar quantization, tree-structured vector quantization (TSVQ), and other progressive scalar and vector quantization coding schemes.

Frame Rate Scaleable Component 31, Resolution Scaleable Component 32, and Quality Scaleable Component 33 can operate in any sequence, including in an interleaved manner. For example, if the Resolution Scaleable Component 32 performs Laplacian pyramid encoding, the Quality Scaleable Component 33 will generate index planes in connection with the Laplacian pyramid decomposition.

Decoder 3' accepts as input Processed Formatted Encoded Video Data 11'. Decoder 3' includes Frame Rearranger 34, and optionally includes Resolution Reconstructor 35 and Index Plane Decoder 36. Frame Rearranger 34 rearranges the video frames into original sequential order. If a frame rate lower than that provided by Processed Formatted Encoded Video Data 11' is desired, temporal layers of frames can be dropped which are unnecessary for achieving the desired frame rate. This situation could arise, for example, when computational power is insufficient to operate Decoder 3' in real-time.

For each frame, Resolution Reconstructor 35 constructs an image of the resolution provided for in the data. Resolution Reconstructor 35 performs the complement to the action performed by Resolution Scaleable Component 32. For example, if Resolution Scaleable Component 32 performed Laplacian pyramid decomposition, Resolution Reconstructor 35 will perform Laplacian pyramid composition. If a resolution lower than that provided by Processed Formatted Encoded Video Data 11' is desired, enhancement layers can be dropped which are unnecessary for achieving the desired resolution.

Index Plane Decoder 36 decodes the codes which comprise the index planes. Index Plane Decoder 36 performs the decode complement to the encode performed by Quality Scaleable Component 33. For example, if Quality Scaleable Component 33 performed TSVQ encoding, Index Plane Decoder 36 will perform TSVQ decoding.

If quality lower than that provided by Processed Formatted Encoded Video Data 11' is desired, index planes can be dropped which are unnecessary for achieving the desired quality.

FIG. 4 depicts a technique for embedded inter-frame compression in accordance with the present invention. Embedded inter-frame compression is inter-frame compression combined with frame rate scalability. Frame rate scaleable data rearranges video frames into temporal layers of frames which are additive, and which can be dropped during network delivery or playback. Embedded inter-frame compression modifies conventional inter-frame compression to take into account these aspects of frame rate scaleable data.

FIG. 4 depicts Frames F1-F9. Frames F1-F9 are nine sequential image frames from the video content arranged into Temporal Layers T1, T2, and T3 in accordance with the present invention. Inter-frame compression is performed on all pairs of adjacent frames in Temporal Layer T1.

For example, Forward Inter-Frame Compression 41 is performed on Frame F5 against Frame F1. In the case of CR, this means that each block of Frame F5 is compared with the corresponding block of Frame F1. If the blocks are sufficiently close, a code, for example X2, is generated for that block of Frame F5 to indicate that the block should be replenished from the previous block. Otherwise, another code, for example X1, would be generated indicating that no replenishment is to occur. In that case, the block would be quantized and transmitted.
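Forward CR on a per-block basis can be sketched as follows. The closeness measure (mean absolute difference), the threshold value, and the function names are assumptions; the description only requires that the blocks be "sufficiently close".

```python
X1_TRANSMIT = "X1"   # no replenishment: block is quantized and transmitted
X2_REPLENISH = "X2"  # block is replenished from the previous frame


def block_distance(block_a, block_b):
    """Mean absolute pixel difference between two equal-sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b)) / len(block_a)


def forward_cr(current_blocks, previous_blocks, threshold=2.0):
    """Return one code per block of the current frame."""
    codes = []
    for current, previous in zip(current_blocks, previous_blocks):
        if block_distance(current, previous) <= threshold:
            codes.append(X2_REPLENISH)
        else:
            codes.append(X1_TRANSMIT)
    return codes


frame_f1 = [[10, 10, 10, 10], [50, 52, 54, 56]]
frame_f5 = [[11, 10, 9, 10], [90, 92, 94, 96]]
print(forward_cr(frame_f5, frame_f1))  # first block replenished, second sent
```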

For the frames in Temporal Layers T2 and T3, both forward inter-frame compression and reverse inter-frame compression are optionally performed. Reverse inter-frame compression is the same as forward inter-frame compression, except that the current frame is compared against a subsequent frame, as opposed to a previous frame. In the case of CR, reverse CR compares blocks in the current frame against blocks in a subsequent frame, as opposed to a previous frame.

In the preferred embodiment, for each frame in the temporal layers beyond the initial layer, forward inter-frame compression is performed against the closest frame in any higher temporal layer which preceded that frame in the original video. Reverse inter-frame compression is performed against the closest frame in any higher temporal layer which was subsequent to that frame in the original video. Here a higher temporal layer is one that is dropped later; for example, Temporal Layer T1 is higher than Temporal Layer T2. Performing inter-frame compression against only higher layers alleviates the problem of the network delivery or playback systems throwing away layers. In accordance with the present invention, temporal layers are discarded starting from the bottom layer and moving up.

For example, Forward Inter-Frame Compression 42 is performed on Frame F3 of Temporal Layer T2 against Frame F1 of Temporal Layer T1. Reverse Inter-Frame Compression 43 is performed on Frame F3 against Frame F5 of Temporal Layer T1. Similarly, Forward Inter-Frame Compression 44 is performed on Frame F2 of Temporal Layer T3 against Frame F1 of Temporal Layer T1. Reverse Inter-Frame Compression 45 is performed on Frame F2 against Frame F3 of Temporal Layer T2.
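Selection of the reference frames for a frame outside the initial temporal layer can be sketched as follows. The function name and the dictionary representation of the layer assignment are hypothetical. Only frames in layers that are dropped later (T1 relative to T2, or T1 and T2 relative to T3) are considered as references, so a dropped layer never removes a frame that a surviving frame depends on.

```python
def reference_frames(frame, layer_of):
    """Return (forward_reference, reverse_reference) for a frame.

    layer_of maps frame number -> temporal layer number (1 = initial layer).
    Candidates are frames in lower-numbered layers only. A missing
    reference is reported as None. Frames of the initial layer have no
    candidates here; they are compressed against adjacent frames of that
    same layer instead.
    """
    own_layer = layer_of[frame]
    candidates = [f for f, layer in layer_of.items() if layer < own_layer]
    previous = [f for f in candidates if f < frame]
    subsequent = [f for f in candidates if f > frame]
    return (max(previous) if previous else None,
            min(subsequent) if subsequent else None)


# Layer assignment of Frames F1-F9 as depicted in FIG. 4.
layer_of = {1: 1, 5: 1, 9: 1, 3: 2, 7: 2, 2: 3, 4: 3, 6: 3, 8: 3}
print(reference_frames(3, layer_of))  # references for Frame F3
print(reference_frames(2, layer_of))  # references for Frame F2
```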

In the case of CR, for a block of a frame in Temporal Layers T2 or T3, a code, for example X3, is generated if forward CR failed but reverse CR succeeded. If forward CR succeeds, code X2 is generated. If both forward CR and reverse CR fail, code X1 is generated. In that case, the block is quantized and transmitted.
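The three-way code selection for a block in Temporal Layers T2 or T3 can be sketched as follows. As in any such sketch, the closeness measure, the threshold, and the function names are assumptions; the order of the tests follows the description, with forward CR taking precedence over reverse CR.

```python
def block_distance(block_a, block_b):
    """Mean absolute pixel difference between two equal-sized blocks."""
    return sum(abs(a - b) for a, b in zip(block_a, block_b)) / len(block_a)


def bidirectional_cr_code(current, previous_ref, subsequent_ref,
                          threshold=2.0):
    """Return "X2", "X3", or "X1" for one block.

    X2: forward CR succeeded (replenish from the previous reference).
    X3: forward CR failed but reverse CR succeeded.
    X1: both failed, so the block is quantized and transmitted.
    """
    if block_distance(current, previous_ref) <= threshold:
        return "X2"
    if block_distance(current, subsequent_ref) <= threshold:
        return "X3"
    return "X1"


block = [10, 10, 10, 10]
print(bidirectional_cr_code(block, [90, 90, 90, 90], [10, 10, 10, 11]))
```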
