Decoder for decoding segment-based encoding of video data using segmentation performed at a decoder. Number: 7,082,166, from the United States Patent and Trademark Office (PTO)


Title: Decoder for decoding segment-based encoding of video data using segmentation performed at a decoder

Abstract: A decoder decodes compressed video data wherein nonkey frames are decoded with reference to other frames from the video data that are reference frames. The decoder generates at least a part of a segmentation of the reference frames for use in decoding nonkey frames. A nonkey frame is regenerated using kinetic information about the current frame and the reference frame segmentation. Kinetic information might include segment translation information. Where the segmentation used in encoding the compressed video data can vary among a plurality of segmentation schemes, the decoder determines which segmentation scheme is used from selection indications in the compressed video data or from previously decoded video data. The decoder might also use partial segmentation information, segmentation hints, partial segment canonical information and/or canonical hints in its segmentation process. The decoder might also process segment-related metadata extracted from the compressed video data.

Patent Number: 7,082,166 Issued on 07/25/2006 to Prakash, et al.


Inventors: Prakash; Adityo (Redwood Shores, CA); Fodor; Eniko (Redwood Shores, CA)
Assignee: PTS Corporation (San Jose, CA)
Appl. No.: 10/105,055
Filed: March 20, 2002


Related U.S. Patent Documents

Application Number    Filing Date    Patent Number    Issue Date
09550705              Apr., 2000     6600786

Current U.S. Class: 375/240.25
Current International Class: H04N 7/12 (20060101)
Field of Search: 375/240.25,240.08,240.12,240.16,240.19,240.2,240.26,240.28,240.29 382/173,236,238,241,242,243,248-250


References Cited

U.S. Patent Documents
5278647 January 1994 Hingorani et al.
5351085 September 1994 Coelho et al.
5448297 September 1995 Alattar et al.
5485279 January 1996 Yonemitsu et al.
5491513 February 1996 Wickstrom et al.
5654760 August 1997 Ohtsuki
5657086 August 1997 Tahara et al.
5703646 December 1997 Oda
5719986 February 1998 Kato et al.
5812791 September 1998 Wasserman et al.
5847767 December 1998 Ueda
5867221 February 1999 Pullen et al.
5926572 July 1999 Kim et al.
5982441 November 1999 Hurd et al.
6026182 February 2000 Lee et al.
6055330 April 2000 Eleftheriadis et al.
6057884 May 2000 Chen et al.
6185363 February 2001 Dimitrova et al.
6693964 February 2004 Zhang et al.
6735253 May 2004 Chang et al.
6829648 December 2004 Jones et al.
Primary Examiner: Le; Vu
Assistant Examiner: Cathey, II; Patrick H.
Attorney, Agent or Firm: Okamoto & Benedicto LLP

Parent Case Text



CROSS-REFERENCES TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. application Ser. No. 09/550,705, filed on Apr. 17, 2000, now U.S. Pat. No. 6,600,786, the complete disclosure of which is incorporated herein by reference for all purposes.
Claims



What is claimed is:

1. A decoder for decoding compressed video data to reconstruct video data that is at least an approximation of uncompressed video data encoded as the compressed video data, wherein the video data comprises a sequence of a plurality of image frames comprising reference frames and non-reference frames, the decoder comprising: a frame decompressor configured to decode a compressed reference frame to form a decoded reference frame; a frame buffer configured to store the decoded reference frame; and a segmenter in the decoder operating separately from the frame decompressor and configured to segment the decoded reference frame, wherein the segmentation is an assignment of some or all of the pixels of the decoded reference frame to segments based on at least one of pixel color values of the pixels and location of the pixels in the decoded reference frame; and a nonkey frame generator that regenerates, when a current frame is a non-reference frame, all or part of the current frame using kinetic information about the current frame from the compressed video data and using at least a part of the segmentation of the reference frame generated by the segmenter.

2. The decoder of claim 1, wherein the decoder is adapted to process kinetic information that includes segment translation relating the pixels of a segment in the reference frame to pixels in a translated position in the current frame.

3. The decoder of claim 1, wherein the segmenter is adapted to generate a segmentation for a given reference frame that is exactly the same as a segmentation done for the encoding of the given reference frame.

4. The decoder of claim 1, wherein the segmenter is adapted to generate a segmentation for a given reference frame that is not necessarily exactly the same as a segmentation done for the encoding of the given reference frame.

5. The decoder of claim 1, wherein the compressed video data encodes at least one reference frame such that the encoding of the frame is not lossless and wherein the segmenter generates a segmentation of a reconstruction of a compressed version of the reference frame that takes into account the losses of information due to compression of the reference frame.

6. The decoder of claim 1, wherein the decoder is adapted to decode other frames using segment-specific kinetic information of a decoded frame.

7. The decoder of claim 1, wherein the decoder includes logic to decompress video data using a plurality of segmentation schemes.

8. The decoder of claim 7, wherein the decoder includes logic to decompress video data based on indications of which of a plurality of segmentation schemes is used.

9. The decoder of claim 8, wherein the indications comprise frame-by-frame indications.

10. The decoder of claim 7, wherein the decoder includes logic to deduce which of a plurality of segmentation schemes was used for a given reference frame from other data extracted from the compressed video data.

11. The decoder of claim 1, wherein the decoder includes logic to decompress the compressed video data using partial segmentation information.

12. The decoder of claim 1, wherein the decoder includes logic to decompress the compressed video data using segmentation hints.

13. The decoder of claim 12, wherein the segmentation hints include indications of differences between a segmentation of a reference frame and a segmentation of a reconstructed nonlossless compression of the reference frame, thereby allowing the decoder to reconstruct and use a segmentation of the uncompressed reference frame.

14. The decoder of claim 12, wherein the decoder is adapted to use the segmentation hints to partially or fully synchronize the decoder segmentation to a segmentation used to compress the video data.

15. The decoder of claim 1, wherein the decoder includes logic to decompress the compressed video data using partial segment canonical information, wherein canonical information indicates an ordering of elements.

16. The decoder of claim 1, wherein the decoder can partially or fully synchronize a decoder canonicalization to a canonicalization used to compress the video data using canonical hints from the compressed video data, wherein canonicalization refers to an ordering of the segments.

17. The decoder of claim 1, wherein the reference frame is a key frame, decodable without reference to other frames.

18. The decoder of claim 1, wherein the reference frame is a nonkey frame.

19. The decoder of claim 1, wherein the decoder is adapted to process a reference frame and a nonkey frame regardless of their relative order in a video sequence.

20. The decoder of claim 1, wherein the decoder is adapted to process the compressed video data using segment-by-segment residue data from the compressed video data.

21. The decoder of claim 20, wherein the residue data is coded at least in part on segmentation information available to the decoder from previously coded frames.

22. The decoder of claim 1, wherein the compressed video data includes metadata associated with segments and the decoder is adapted to include the metadata in the decoder output stream such that it remains associated with segments.

23. A method of decoding compressed video data to reconstruct video data that is at least an approximation of uncompressed video data encoded as the compressed video data, wherein the video data comprises a sequence of a plurality of image frames comprising key frames and nonkey frames, the method comprising: decoding a compressed first key frame into a reconstructed first key frame, wherein the compressed first key frame is a frame encoded without requiring knowledge of contents of another frame; segmenting the reconstructed first key frame into a first segmentation, wherein the segmentation is an assignment of some or all of the pixels of the reconstructed first key frame to segments based on at least one of pixel color values of the pixels and location of the pixels in the reconstructed first key frame; and decoding a compressed first nonkey frame to a reconstructed first nonkey frame using the reconstructed first key frame as a reference frame, comprising: a) associating segments of the reference frame to pixels of the first nonkey frame based on kinetic information about the first nonkey frame from the compressed video data; and b) determining content of the reconstructed first nonkey frame from results of the step of associating and at least some residue data.

24. The method of claim 23, further comprising: segmenting the reconstructed first nonkey frame to form a second segmentation; decoding a second nonkey frame to a reconstructed second nonkey frame using the reconstructed first nonkey frame and the second segmentation.

25. The method of claim 23, further comprising decoding a second nonkey frame using the reconstructed first key frame and the first segmentation.

26. The method of claim 23, wherein determining content of the reconstructed first nonkey frame comprises: processing kinetic data associated with the compressed first nonkey frame; processing model data associated with the compressed first nonkey frame; and processing residue data associated with the compressed first nonkey frame, wherein the residue data represents differences between an unencoded first nonkey frame and a frame wherein the reference frame, the kinetic data and the model data have been applied.

27. The method of claim 23, wherein segmenting the reconstructed first key frame is performed according to one of a predetermined plurality of segmentation schemes.

28. The method of claim 27, wherein indications of which of the plurality of segmentation schemes is used are extracted from the compressed video data.

29. The method of claim 28, wherein the indications comprise frame-by-frame indications.

30. The method of claim 23, wherein segmenting the reconstructed first key frame is performed using partial segmentation information extracted from the compressed video data.

31. The method of claim 23, wherein segmenting the reconstructed first key frame is performed using segmentation hints extracted from the compressed video data.

32. The method of claim 31, wherein the segmentation hints include indications of differences between a segmentation of an uncompressed reference frame and a segmentation of a reconstructed nonlossless compression of the reference frame.

33. The method of claim 31, further comprising using the segmentation hints to partially or fully synchronize the decoder segmentation to a segmentation used to compress the video data.

34. The method of claim 23, further comprising using partial segment canonical information, wherein canonical information indicates an ordering of elements.

35. The method of claim 23, further comprising using canonical hints from the compressed video data to partially or fully synchronize a decoder canonicalization to a canonicalization used to compress the video data, wherein canonicalization refers to an ordering of the segments.
Description



BACKGROUND OF THE INVENTION

1. Brief Introduction

As more communication requires video, such as real-time streaming of video, video conferencing, digital television, interactive television and Internet-based communications such as hypertext transport of World Wide Web (WWW) content, more efficient ways of utilizing existing bandwidth are needed. This is because the typical bandwidth allocated to a particular transmission mode (e.g., broadcast, cable, telephone lines, etc.) is much less than the bandwidth typically required for a video stream. Thus, if such modes are to carry video, compression is needed. Compression is also needed where the video is stored, so that storage capacity is efficiently used. The advent of multi-media capabilities on most computer systems has taxed traditional storage devices, such as hard drives, to their limits.

Compression allows digitized video sequences to be represented efficiently, allowing more video to be transmitted in a given amount of time over a given channel, or more video to be stored in a given storage medium. Compression does this by reducing the bitstream, or video information flow, of the video sequences at a transmitter (which can be placing the bitstream into a channel or storing into a storage medium) while retaining enough information that a decoder or receiver at the other end of the channel or reading the storage medium can reconstruct the video in a manner adequate for the specific application, such as television, videoconferencing, etc.

Video is typically represented by a sequence of images, called "frames" or "video frames" that, when played in sequence, present the video. As used herein, a video stream might refer to a video and audio stream, where the audio is included with the video. However, for simplicity, just the video compression is often described.

As the terms are used herein, an image is data derived from a multi-dimensional signal. The signal might be originated or generated either naturally or artificially. This multi-dimensional signal (where the dimension could be one, two, three, or more) may be represented as an array of pixel color values such that pixels placed in an array and colored according to each pixel's color value would represent the image. Each pixel has a location and can be thought of as being a point at that location or as a shape that fills the area around the pixel such that any point within the image is considered to be "in" a pixel's area or considered to be part of the pixel. The image itself might be a multidimensional pixel array on a display, on a printed page, an array stored in memory, or a data signal being transmitted and representing the image. The multidimensional pixel array can be a two-dimensional array for a two-dimensional image, a three-dimensional array for a three-dimensional image, or some other number of dimensions.

The image can be an image of a physical space or plane or an image of a simulated and/or computer-generated space or plane. In the computer graphic arts, a common image is a two-dimensional view of a computer-generated three-dimensional space (such as a geometric model of objects and light sources in a three-space). An image can be a single image or one of a plurality of images that, when arranged in a suitable time order, form a moving image, herein referred to as a video sequence.

Pixel color values can be selected from any number of pixel color spaces. One color space in common use is known as the YUV color space, wherein a pixel color value is described by the triple (Y, U, V), where the Y component refers to a grayscale intensity or luminance, and U and V refer to two chrominance components. The YUV color space is commonly seen in television applications. Another common color space is referred to as the RGB color space, wherein R, G and B refer to the Red, Green and Blue color components, respectively. The RGB color space is commonly seen in computer graphics representations, along with CYMB (cyan, yellow, magenta, and black) often used with computer printers.
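The relationship between the two color spaces can be sketched in a few lines. The text above does not give conversion coefficients, so the sketch below assumes the widely used BT.601 luma weights and analog U/V scale factors; the function names are illustrative only.

```python
def rgb_to_yuv(r, g, b):
    """Convert RGB components in [0, 1] to a (Y, U, V) triple.

    Assumption: BT.601 luma weights and analog U/V scaling.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance (grayscale intensity)
    u = 0.492 * (b - y)                    # blue-difference chrominance
    v = 0.877 * (r - y)                    # red-difference chrominance
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv under the same assumed coefficients."""
    r = y + v / 0.877
    b = y + u / 0.492
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b
```

A neutral gray or white pixel has equal R, G and B, so both chrominance components come out as zero and only Y carries information.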

Video compression is possible because an uncompressed video sequence contains redundancies and some of the video signal can be discarded without greatly affecting the resulting video. For example, each frame of a video sequence representing a stationary scene would be nearly identical to other frames in the video sequence. Most video compression routines attempt to remove the superfluous information so that the related image frames can be represented in terms of previous image frame(s), thus eliminating the need to transmit an entire image for each video frame. Alternatively, routines like motion JPEG code each video frame separately and ignore temporal redundancy.

2. Known Compression Techniques

There have been numerous attempts at adequately compressing video imagery. These methods generally fall into the following two categories: 1) spatial redundancy reduction, and 2) temporal redundancy reduction.

2.1. Spatial Redundancy Reduction

Spatial redundancy reduction takes advantage of the correlation among neighboring pixels in order to derive a more efficient representation of the important information in an image frame. These methods are more appropriately termed still-image compression routines, as they generally address each frame in isolation, i.e., independent of other frames in the sequence. Because of this, they do not attempt to reduce temporal, or frame-to-frame, redundancy. Common still-image compression schemes include JPEG, wavelets, and fractals.

2.1.1. JPEG/DCT Based Image Compression

One of the first commonly used methods of still-image compression was the discrete cosine transform ("DCT") compression system, which is at the heart of JPEG. DCT operates by representing each digital image frame as a series of cosine waves or frequencies and quantizing coefficients of the cosine series. The higher frequency coefficients are quantized more harshly than those of the lower frequencies. The result of the quantization is a large number of zero coefficients, which can be encoded very efficiently. However, JPEG and similar compression schemes do not address the crucial issue of temporal redundancy.
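The transform-then-quantize idea can be illustrated on a single row of samples. The sketch below is a one-dimensional simplification (JPEG itself works on two-dimensional blocks), and the quantizer step sizes `base_step` and `slope` are made-up values chosen only to show coarser quantization at higher frequencies.

```python
import math


def dct_1d(samples):
    """Orthonormal DCT-II of a list of samples."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            s * math.cos(math.pi * (i + 0.5) * k / n)
            for i, s in enumerate(samples)
        )
        coeffs.append(scale * total)
    return coeffs


def quantize(coeffs, base_step=4.0, slope=4.0):
    """Divide each coefficient by a step that grows with frequency index k.

    The step sizes are hypothetical; real codecs use tuned tables.
    """
    return [round(c / (base_step + slope * k)) for k, c in enumerate(coeffs)]


# A perfectly flat row has energy only in the lowest-frequency (DC) term,
# so every other quantized coefficient is zero.
flat_levels = quantize(dct_1d([10] * 8))
```

Smooth image regions behave similarly to the flat row: most of their energy sits in a few low-frequency coefficients, which is why the quantized output contains long runs of zeros.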

2.1.2. Wavelets

As a slight improvement to the DCT compression scheme, the wavelet transformation compression scheme was devised. This system is similar to the DCT, differing mainly in that an image frame is represented as a series of wavelets, or windowed oscillations, instead of as a series of cosine waves.

2.1.3. Fractals

Another technique is known as fractal compression. The goal of fractal compression is to take an image and determine a single function, or a set of functions, which fully describe(s) the image frame. A fractal is an object that is self-similar at different scales or resolutions, i.e., no matter what resolution one looks at, the object remains the same. In theory, where fractals allow simple equations to describe complex images, very high compression ratios should be achievable.

Unfortunately, fractal compression is not a viable method of general compression. The high compression ratios are only achievable for specially constructed images, and only with considerable help from a person guiding the compression process. In addition, fractal compression is very computationally intensive.

2.2. Temporal and Spatial Redundancy Reduction

Adequate motion video compression requires reduction of both temporal and spatial redundancies. Temporal redundancy can be reduced by replacing all or part of the bits representing the image of a frame with one or more references to other frames or portions of a frame. This allows a small number of bits to represent a larger number of bits. Block matching is the basis for most currently used effective means of temporal redundancy removal.

In block matching, an image frame is subdivided into uniform size blocks (more generally, into polygons), and each block is tracked from one frame to another and represented by a motion vector, instead of having the block re-coded and placed into the bitstream for a second time. Examples of compression routines that use block matching include MPEG and variants thereof.
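A minimal sketch of block matching follows, assuming frames are lists of rows of integer pixel values and using an exhaustive search with the sum of absolute differences (SAD) as the match cost. The search strategy, cost function and function names are assumptions for illustration; the text above does not prescribe them.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a block of ref and a block of cur."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
    return total


def find_motion_vector(ref, cur, cx, cy, size, search):
    """Find where the block of cur at (cx, cy) came from in ref.

    Returns (dx, dy) such that the best match in ref is at (cx + dx, cy + dy).
    Candidate positions that fall outside the reference frame are skipped.
    """
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]
```

Only the two-component vector is placed in the bitstream for a matched block, which is far smaller than the block's pixel data.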

MPEG encodes the first frame in a sequence of related frames in its entirety as a so-called intra-frame, or I-frame. An I-frame is a type of key frame, meaning an image frame that is completely self-contained and not described in relation to any other image frame. To create an I-frame, MPEG performs a still-image compression on the frame, including dividing the frame into 16 pixel by 16 pixel square blocks. Other (so-called "predicted") frames are encoded with respect to the I-frame by predicting corresponding blocks of the other frame in relation to that of the I-frame. That is, MPEG attempts to find each block of an I-frame within the other frame. For each block that still exists in the other frame, MPEG transmits the motion vector, or movement, of the block along with block identifying information. However, as a block moves from frame to frame, it may change slightly. The difference relative to the I-frame is known as residue. Additionally, as blocks move, previously hidden areas may become visible for the first time. These previously hidden areas are also known as residue. That is, the collective remaining information after the block motion is sent is known as the residue, which is coded using JPEG and included in the bitstream to complete the image frame.
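The role of the residue can be sketched as simple arithmetic on one block: the encoder subtracts the motion-compensated prediction from the actual block, and the decoder adds the residue back. The sketch below assumes frames are lists of rows of integers and omits the lossy coding of the residue; all names are hypothetical.

```python
def predict_block(ref, x, y, motion_vector, size):
    """Copy the block that the motion vector points to in the reference frame."""
    dx, dy = motion_vector
    return [
        [ref[y + dy + j][x + dx + i] for i in range(size)]
        for j in range(size)
    ]


def block_residue(cur, pred, x, y):
    """Difference between the actual block of cur at (x, y) and its prediction."""
    size = len(pred)
    return [
        [cur[y + j][x + i] - pred[j][i] for i in range(size)]
        for j in range(size)
    ]


def reconstruct_block(pred, residue):
    """Decoder side: prediction plus residue gives back the block."""
    return [
        [p + r for p, r in zip(pred_row, res_row)]
        for pred_row, res_row in zip(pred, residue)
    ]
```

When the prediction is good, the residue values are small and mostly zero; in this sketch the residue is kept exact, so reconstruction is lossless.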

Subsequent frames are predicted with respect to either the blocks of the I-frame or a preceding predicted frame. In addition, the prediction can be bi-directional, i.e., with reference to both preceding and subsequent I-frames or predicted frames. The prediction process continues until a new key frame is inserted, at which point a new I-frame is encoded and the process repeats itself.

Although state of the art, block matching is highly inefficient and fails to take advantage of the known general physical characteristics or other information inherent in the images. The block method is both arbitrary and crude, as the blocks do not have any relationship with real objects in the image. A given block may comprise a part of an object, a whole object, or even multiple dissimilar objects with unrelated motion. In addition, neighboring objects will often have similar motion. However, since blocks do not correspond to real objects, block-based systems cannot use this information to further reduce the bitstream.

Yet another major limitation of block-based matches arises because the residue created by block-based matching is generally noisy and patchy. Thus, block-based residues do not lend themselves to good compression via standard image compression schemes such as DCT, wavelets, or fractals.

2.3. Alternatives

It is well recognized that the state of the art needs improvement, specifically in that the block-based method is extremely inefficient and does not produce an optimally compressed bitstream for motion video information. To that end, the very latest compression schemes, such as MPEG4, allow for the inclusion of limited structural information, if available, of selected items within the frames rather than merely using arbitrary-sized blocks. While some compression gains are achieved, the associated overhead information is substantially increased because, in addition to the motion and residue information, these schemes require that structural or shape information for each object in a frame must also be sent to the receiver.

Additionally, as mentioned above, the current compression methods treat the residue as just another image frame to be compressed by JPEG using a fixed compression.

BRIEF SUMMARY OF THE INVENTION

A decoder decodes compressed video data to reconstruct video data from compressed video data comprising key frames and nonkey frames, including a current frame buffer for storing at least a part of a current frame being decoded, a reference frame buffer for storing at least a part of a reference frame, wherein the reference frame is a frame that was used in the encoding of the current frame and a frame that is decodable from the compressed video data prior to the decoding of the current frame, a segmenter that generates at least a part of a segmentation of the reference frame, wherein the segmentation is an assignment of some or all of the pixels of the reference frame to segments based on at least one of pixel color values of the pixels and location of the pixels in the reference frame, and a nonkey frame generator that regenerates, when the current frame is a nonkey frame, all or part of the current frame into the current frame buffer using kinetic information about the current frame from the compressed video data and using at least a part of the segmentation of the reference frame generated by the segmenter. The kinetic information might include segment translation relating the pixels of a segment in the reference frame to pixels in a translated position in the current frame. The segmentation done by the decoder for a given reference frame might be exactly the same, or different from, a segmentation done for the encoding of the given reference frame. Where the segmentation used in encoding the compressed video data can vary among a plurality of segmentation schemes, the decoder determines which segmentation scheme is used, possibly on a frame-by-frame basis, from selection indications in the compressed video data or from previously decoded video data. The decoder might also use partial segmentation information, segmentation hints, partial segment canonical information and/or canonical hints in its segmentation process. 
The decoder might also process segment-related metadata extracted from the compressed video data.
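The core idea of the summary, a decoder that segments the reference frame itself and then moves segments according to kinetic information, can be sketched as follows. This is a toy illustration under stated assumptions, not the patented method: the segmenter simply groups 4-connected pixels of identical value (a stand-in for segmentation by pixel color and location), kinetic information is modeled as a dictionary of per-segment translations, and uncovered pixels are marked with a `fill` value to be supplied from other data in the stream. All names are hypothetical.

```python
def segment(frame):
    """Label 4-connected regions of equal pixel value, in raster-scan order."""
    height, width = len(frame), len(frame[0])
    labels = [[-1] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if labels[y][x] != -1:
                continue
            labels[y][x] = next_label
            stack = [(x, y)]
            while stack:
                px, py = stack.pop()
                for nx, ny in ((px + 1, py), (px - 1, py), (px, py + 1), (px, py - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and labels[ny][nx] == -1
                            and frame[ny][nx] == frame[py][px]):
                        labels[ny][nx] = next_label
                        stack.append((nx, ny))
            next_label += 1
    return labels


def regenerate(ref, labels, translations, fill=-1):
    """Rebuild a frame by translating segments of the reference frame.

    translations maps a segment label to (dx, dy); unlisted segments stay put.
    Stationary segments are painted first so that moved segments overwrite them.
    """
    height, width = len(ref), len(ref[0])
    out = [[fill] * width for _ in range(height)]
    for moving in (False, True):
        for y in range(height):
            for x in range(width):
                dx, dy = translations.get(labels[y][x], (0, 0))
                if ((dx, dy) != (0, 0)) != moving:
                    continue
                nx, ny = x + dx, y + dy
                if 0 <= nx < width and 0 <= ny < height:
                    out[ny][nx] = ref[y][x]
    return out
```

Because the segmenter is deterministic, an encoder running the same routine on the same decoded reference frame would obtain the same labels, so segment shapes would not need to be transmitted; only the per-segment translations would.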

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a video stream processing system; FIG. 1(a) illustrates an example where video is compressed for transmission over a channel; FIG. 1(b) illustrates an example where video is compressed for storage.

FIG. 2 is a block diagram of an encoder according to embodiments of the present invention.

FIG. 3 is a diagram illustrating structure of a video stream according to embodiments of the present invention.

FIG. 4 is a diagram illustrating another variation of structure of a video stream.

FIG. 5 is a block diagram of a decoder according to embodiments of the present invention.

FIG. 6 is a block diagram of a portion of an encoder, such as the encoder of FIG. 2, including a modeller.

FIG. 7 is an illustration of exposed areas.

FIG. 8 is a flowchart of an encoding process.

FIG. 9 is a flowchart of a decoding process.

DETAILED DESCRIPTION OF THE INVENTION

FIG. 1 is a block diagram of a video stream processing system 10. System 10 accepts video data from any number of sources and encodes it using encoder 100 such that the video data is compressed (i.e., occupies fewer bits than the uncompressed video data) for transport or storage. System 10 includes a decoder 200 that receives the transported or stored compressed video data and decompresses it for use by any number of video sinks (users).

Merely by way of example, possible video sources include a video camera, a video storage system (typically storing uncompressed, or partially compressed, video data), a high-speed channel, such as a cable link or broadcast link capable of transmitting uncompressed or partially compressed video data, or a video player, such as a VCR or DVD player. Possible video users, for example, might include a display device, such as a monitor or television, a video processor or video storage that can store decoded video data.

FIG. 1(a) illustrates an example where video is compressed for transmission over a channel 120. Channel 120, for example, could be a digital subscriber line (DSL), a cable modem, a dialup connection, broadcast, cable broadcast, satellite transmission, or the like. In such cases, the video is compressed so that it can be transmitted using available bandwidth efficiently.

FIG. 1(b) illustrates an example of a system 20 where video is compressed for storage. As shown, encoder 100 encodes video data for storage in compressed video storage 130 for later retrieval by decoder 200. Storage 130 might be, for example, a hard drive, a memory card, a personal video recorder (PVR), RAM, CD, DVD, or any other suitable storage.

Note that the same encoder and decoder can be used for a transmission system as used for a storage system. Of course, the encoders and decoders could be different. The differences could be external, such as changing the output of the encoder to point to a storage device rather than a channel, but the changes could also be internal, such as changing the methods used by the encoder depending on whether or not the encoder's output is time critical. For example, if it is known a priori that the encoded video will not be read from storage right away, the encoder could trade off speed for improved compression.

In a basic operation, video data, usually uncompressed video data, is provided to encoder 100, which encodes the video data to form compressed video data that occupies fewer bits than the uncompressed video data, and preferably far fewer bits, and makes the compressed video data available to the decoder (via a channel, storage, or a combination thereof). The decoder in turn decompresses the compressed video data to arrive at an exact or approximate copy of the uncompressed video data provided to the input of the encoder.

FIG. 2 is a block diagram of encoder 100 according to embodiments of the present invention. As shown there, encoder 100 comprises a frame loader 202, a frame compressor 204, a motion matcher 206, a residue generator 208, an output scheduler 210 and a segmenter 220. Also shown are storage for data being processed, such as a frame buffer 230 for holding all or part of a current frame, frame buffer 232 for holding all or part of a reference frame, segment data set storage 234, kinetic information storage 236 for storing motion factors and other kinetic information, and residue data storage 238. Also shown, and explained below are a frame decompressor 240 and a frame regenerator 242.

Frame loader 202 is configured to receive uncompressed video in and provide the uncompressed video in a frame-by-frame manner to frame buffer 230. It should be understood that the video in could be partially compressed and could be in any of a variety of formats. As shown, frame buffer 230 is coupled to frame compressor 204, motion matcher 206, and residue generator 208 to provide all or part of the information embodied in the current frame stored in frame buffer 230.

As used herein, the term "current frame" refers to a frame of video being processed by the encoder. In a typical operation, a frame is loaded into frame buffer 230 and becomes the current frame, that current frame is processed and another frame is loaded into frame buffer 230 and that frame would then be the current frame. The other frame buffer, frame buffer 232, is coupled to motion matcher 206 and residue generator 208 to provide all or part of the information content of the reference frame. Frame buffer 232 is also coupled to a segmenter 220, which is in turn coupled to storage 234, thereby allowing segmenter 220 to generate and store a segment data set associated with the reference frame. Storage 234 is coupled to motion matcher 206 to allow motion matcher 206 to obtain all or part of a segment data set.

Residue generator 208 is coupled to frame buffers 230 and 232, as well as kinetic information storage 236 such that residue generator 208 can use information stored therein to generate residue data stored in residue storage 238.

As used herein, the term "reference frame" refers to a frame whose information content is used, at least in part, in the encoding of the current frame. In the general case, the current frame might be encoded with reference to more than one reference frame, but for clarity encoder 100 in its operation is described here where only one reference frame is needed. As used herein, the term "key frame" refers to a frame that is encoded such that it can be decoded without reference to other frames. Note that reference frames are not required to be key frames but can be frames that are encoded with reference to yet other reference frames.

An operation of encoding frames will now be described beginning with the encoder in an initial state. Initially, frame buffer 230 and frame buffer 232 are empty. Frame loader 202 loads frame buffer 230 with a frame of the input video. Since there is no reference frame at this point, that frame in frame buffer 230 would naturally be encoded as a key frame. However, it should be understood that in some variations, a reference frame might be preloaded, in which case the first frame processed does not need to be a key frame.

Continuing the description of the operation, frame compressor 204 obtains the frame from frame buffer 230 and compresses it into an encoded frame. Such a compression could be lossy or lossless (which, technically, is just a special case of lossy compression). That encoded frame is then provided to output scheduler 210 to form the output video sequence. The encoded frame is also provided to a frame decompressor 240 that decompresses the frame and provides it to frame regenerator 242. The output of frame regenerator 242 is stored in frame buffer 232 as the reference frame to be used for subsequent encoding steps. Of course, if the output of frame compressor 204 is known to be a lossless compression, such that the output of frame decompressor 240 would be an exact replica of the current frame, then frame decompressor 240 and frame regenerator 242 can be eliminated and instead the contents of frame buffer 230 could simply be copied into frame buffer 232 once the current frame is encoded. Either way, once the current frame has been processed, frame loader 202 can load another frame into frame buffer 230 and that frame would become the current frame to be encoded. At this point, a reference frame is available in frame buffer 232 and a process of encoding the current frame with reference to the reference frame will now be described.

Where the current frame is encoded with reference to the reference frame, the use of frame compressor 204 is not required. Instead, motion matcher 206 can operate on the current frame, the reference frame, and segment data about the reference frame generated by segmenter 220, to output kinetic information, which is stored in kinetic information storage 236. The operation of motion matcher 206 is described in more detail below. The kinetic information output by motion matcher 206 relates to changes in segments from the reference frame to the current frame. In other words, the reference frame is segmented such that areas of the reference frame are associated with segment identifiers, thus resulting in segments having segment boundaries bounding pixels of the reference frame. These segments can be matched to pixels in the current frame and kinetic information about the segments can be identified. Merely one example of information about a segment might be a determination that a particular segment of the reference frame is suitably represented by a similar collection of pixels in the current frame, possibly offset in location and/or color values. Once all of the segments that are to be matched have been matched from the reference frame to pixels in the current frame, the kinetic information associated with the current frame can be provided to output scheduler 210 to form part or all of the encoding of the current frame, as well as being provided to residue generator 208.

Residue generator 208 can then, from the kinetic information, the current frame, and the reference frame, determine what differences would remain between the current frame and the reference frame after the kinetic information is applied to the segments of the reference frame. Such a residue might include changes in position, shape, or color value of pixels associated with a segment that are not already accounted for in the kinetic information. Residue might also include exposed area. An exposed area would occur, for example, where the segments represent objects in a scene and those objects are moving between the reference frame and the current frame. If that were the case, there would be some pixels in the current frame that are not associated with any segment of the reference frame because the objects or portions of objects represented by those pixels of the current frame were objects or portions of objects obscured by other objects in the reference frame. This is illustrated by FIG. 7, which shows exposed areas 704 resulting from the frame-to-frame motion of segments 702. Thus, the pixel values for exposed areas, and other residues might form the residue data output from residue generator 208 to storage 238.

Residue data 238 is provided to output scheduler 210 to form another part of the encoding of the current frame. As should be apparent from this description, if the residue data is an exact representation of the difference between the reference frame and the current frame after the kinetic information is applied to the reference frame, then the output of output scheduler 210 contains enough information such that the current frame could be exactly reconstructed from nothing more than prior knowledge of the exact contents of the reference frame, the kinetic information relating the reference frame and the current frame and the resulting residue data. However, in some cases exact replication of the current frame is not always required, in which case residue data 238 might be not the exact difference. If that is the case, then frame regenerator 242 is preferably used to regenerate the current frame from the reference frame, the kinetic information, and the residue data, so that the reference frame used for subsequent encoding is not the exact reference frame, but the reference frame as it is known to be recoverable at the decoder.

Encoder 100 can repeat the process with subsequent frames becoming the current frame, until the video is completely encoded. Although an encoder might always have a reference frame available, the encoder could choose to ignore the reference frame and encode the current frame as a key frame. This might happen, for example, as the result of an external trigger, upon detection of a scene change, or after the encoder has determined that the residue data is such that encoding the current frame as a key frame would be more efficient. In instances where frames are not always losslessly compressed, the encoder might decide not to use the reference frame if it determines that sufficient losses have accumulated in the process of encoding frames and using those frames as references for subsequent encodings of frames that are used for references, etc.

FIG. 3 is a diagram illustrating the structure of a compressed video stream, as might be output by the encoder shown in FIG. 2. As illustrated there, a frame K is followed by several non-key frames, such as frames K+1 and K+2. As illustrated, frames K+1 and K+2 can be fully represented by kinetic data, model data (explained in further detail below with reference to FIG. 6) and residue data. The kinetic data, shown by way of example, is further detailed as comprising data elements associated with segments of the current frame's reference frame. In this example, the reference frame for frame K+1 might be frame K.

As an example of the kinetic information associated with each segment, the data elements there shown include translation data, z-order data, affine data, non-linear data, lighting data, and other data. The encoded video data stream might also include, either in a header applicable to all frames, or on a frame-by-frame basis, an indication of which of a plurality of segmentation schemes was used, partial segmentation information or segmentation hints and/or partial canonical information about how the segments are ordered or labelled with index values. Typically, canonical information is not needed in the compressed video as the decoder should normally be able to order segments in the same way as the encoder did. The encoded video data stream might have some of the kinetic information associated with segments in a segmentation omitted if it can be predicted by the decoder.

The encoding of the difference between the rough frame and the raw frame can be a novel residue frame as described herein or just a simple difference frame conventionally compressed. The residue frame can be encoded as a frame or might be encoded as segment-by-segment residue.

FIG. 4 illustrates additional data constructs that might be present in the compressed video information. The additional information shown there includes a field for indicating the segmentation scheme used for the current frame, partial segmentation information and/or hints about segmentation usable by a decoder, canonicalization information indicating an ordering of the segments and other hints that might possibly be present.

One canonicalization approach is to assign segment indices to segments based on position in the frame. For example, the segment that includes the pixel in the upper left corner of the frame could be segment 1, the segment containing the next leftmost pixel in the top row that does not belong to segment 1 could be segment 2, and so on through all the rows of pixels. If this approach is used, the encoder and decoder can independently determine the same canonicalization as they segment their own copies of a reference frame.
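The position-based canonicalization described above can be sketched as follows. This is only a minimal illustration, assuming the segmentation is held as a per-pixel label map (a list of rows) whose labels are arbitrary; the function name `canonicalize` and the sample label maps are hypothetical and do not come from the text.

```python
def canonicalize(label_map):
    """Renumber segments 1, 2, 3, ... in raster-scan order of first appearance."""
    mapping = {}          # arbitrary label -> canonical index
    canonical = []
    for row in label_map:                 # top row first
        out_row = []
        for label in row:                 # leftmost pixel first
            if label not in mapping:
                mapping[label] = len(mapping) + 1
            out_row.append(mapping[label])
        canonical.append(out_row)
    return canonical


# Encoder and decoder may label the same segments differently internally.
encoder_labels = [["a", "a", "b"],
                  ["c", "a", "b"]]
decoder_labels = [[7, 7, 3],
                  [9, 7, 3]]
```

Because the renumbering depends only on pixel positions, both label maps above reduce to the same indices, which is why no ordering data needs to travel in the compressed stream in this case.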

FIG. 5 is a block diagram of a decoder according to embodiments of the present invention. As shown there, a decoder includes an input scheduler 502, a frame decompressor 504, a segmenter 506, a current frame reconstructor 508 and a processor 510. Input scheduler 502 receives compressed video information, from a channel, from storage or from another source. For key frames, the video data can be provided to frame decompressor 504 for decoding. Frame decompressor 504 can then decode the frame and store it in a frame buffer 520. For nonkey frames, the video data can be provided to kinetic information storage 522 and residue storage 524. Other storage shown includes segment dataset storage 526 and approximation frame storage 528.

In operation, when a key frame is received, input scheduler 502 provides it to frame decompressor 504, which decompresses the key frame and stores it in frame buffer 520, and that uncompressed frame can be output for the use of the video user coupled to the decoder. When a nonkey frame is received, other elements of the decoder process the frame. In some embodiments, the decoder might determine that the next frame is a key frame by examining a flag in the compressed video data associated with the frame.

Once a frame is decoded and output, it can be the reference frame, stored in frame buffer 520. The decoder includes a segmenter 506 that can segment the frame in frame buffer 520 into a set of segments. The segmentation results are stored as a segment dataset in storage 526. There are many ways to structure the results. One such method is to identify each segment with an index and a segment boundary, which is a closed shape that encloses one or more pixels of the reference frame (although degenerate, zero-pixel segments should not be ruled out). Another method is to associate each pixel in the reference frame with a segment index. However it is stored, it should be noted that the decoder can generate the segment dataset, at least approximately, without requiring any additional data from the encoder, which might increase the size of the compressed video data. Thus, the decoder doing its own segmentation allows for greater compression than if the decoder relied on the encoder's segmentation results.
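As a rough illustration of a segmenter that produces the per-pixel form of a segment dataset, the sketch below labels 4-connected regions of identical pixel value. This toy flood fill is an assumption made for illustration; it is not the segmentation scheme of the embodiments, and the names `segment_by_color` and `example_frame` are hypothetical.

```python
from collections import deque


def segment_by_color(frame):
    """Return a per-pixel segment index map for 4-connected equal-valued regions."""
    height, width = len(frame), len(frame[0])
    labels = [[0] * width for _ in range(height)]   # 0 means "not yet labelled"
    next_index = 1
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                continue
            # Start a new segment and grow it breadth-first.
            labels[y][x] = next_index
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and frame[ny][nx] == frame[cy][cx]):
                        labels[ny][nx] = next_index
                        queue.append((ny, nx))
            next_index += 1
    return labels


example_frame = [[5, 5, 9],
                 [5, 9, 9],
                 [5, 5, 5]]
segment_dataset = segment_by_color(example_frame)
```

The function is deterministic, so an encoder and a decoder that run it on identical copies of a reference frame obtain identical segment datasets without exchanging any segmentation data.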

When a nonkey frame is processed, the reference frame for that nonkey frame is present in frame buffer 520. As explained above, a nonkey frame might have been encoded with reference to more than one reference frame, but for clarity, this explanation relates to the case where only one reference frame is needed for decoding a nonkey frame. It should be noted that the reference frame need not be the frame immediately prior to the nonkey frame in the video sequence and need not even be prior to the nonkey frame being decoded.

As illustrated in FIG. 3, a nonkey frame is encoded by kinetic data, residue data and possibly other data. In the decoder shown in FIG. 5, the kinetic data for the nonkey frame is supplied to kinetic data storage 522 and the residue data for the nonkey frame is supplied to residue storage 524. The current frame reconstructor is coupled to receive all or part of the reference frame information from frame buffer 520, all or part of the segment dataset for that reference frame, and all or part of the kinetic information for the current frame. The current frame reconstructor is configured to generate a rough frame, stored in storage 528, from that information.

The rough frame is an approximation, although it might be exact, of the current frame from the reference frame, its segmentation and kinetic information relating the segments of the reference frame to the current frame. Note that the segmentation information was not required to be included in the overhead of the compressed video, but instead could have been generated entirely by the decoder. In some embodiments, decoder effort might be more of a concern than efficient bandwidth usage, in which case the encoder might include in the compressed video some partial segmentation information or hints to assist the decoder in generating its own segment dataset.
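One way to picture the rough frame is the sketch below, which handles only the simplest kinetic information: a per-segment (dx, dy) translation plus a z-order used to resolve overlaps. The function name, the dictionary layout and the use of `None` for pixels that no segment lands on are assumptions for illustration, not details given in the text.

```python
def build_rough_frame(reference, labels, translations, z_order=None):
    """Translate each reference-frame segment; higher z-order is painted last."""
    z_order = z_order or {}
    height, width = len(reference), len(reference[0])
    rough = [[None] * width for _ in range(height)]   # None = not covered
    segment_ids = sorted({s for row in labels for s in row},
                         key=lambda s: z_order.get(s, 0))
    for seg in segment_ids:
        dx, dy = translations.get(seg, (0, 0))        # default: segment does not move
        for y in range(height):
            for x in range(width):
                if labels[y][x] != seg:
                    continue
                ty, tx = y + dy, x + dx
                if 0 <= ty < height and 0 <= tx < width:
                    rough[ty][tx] = reference[y][x]
    return rough


reference = [[1, 1, 0, 0],
             [1, 1, 0, 0]]
labels = [[1, 1, 2, 2],
          [1, 1, 2, 2]]
# Segment 1 moves one pixel to the right and sits in front of segment 2.
rough = build_rough_frame(reference, labels, {1: (1, 0)}, z_order={1: 1, 2: 0})
```

In this example the leftmost column of the rough frame is left uncovered, which is the kind of gap the residue data would have to fill.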

Processor 510 is configured to accept the rough frame and residue information to form a regenerated current frame, which can then be output by the decoder. The regenerated current frame might then be used as a reference frame for later received (but not necessarily later in the video sequence) frames.

Each of the components shown in FIG. 5 might be implemented in special purpose hardware, programmable hardware or software. For example, each of components 502, 504, 506, 508 and 510 might be portions of one program operating on an input data stream. Each of the storage elements 520, 522, 524, 526 and 528 might be separate storage areas, or might be separate portions of a common storage or memory. In some cases, where it is more efficient, the frame buffers might change roles rather than having the data from one frame buffer copied to another frame buffer.

Generally, the operations of the components of the decoder perform the inverse of the operation performed by the encoder. For example, where the residue data is simply a compressed difference frame of the difference between a rough frame and the current frame, processor 510 might simply read the residue data for the current frame, decompress it and add it back to the rough frame to result in a reconstruction of the current frame.
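The simple case just described, in which the residue is a compressed difference frame, might look like the following sketch. It assumes residue values fit in signed 8-bit integers and uses `zlib` purely as a stand-in for whatever conventional compressor is actually employed; the function names are hypothetical.

```python
import zlib
from array import array


def compress_residue(residue):
    """Encoder side: flatten the difference frame and compress it (stand-in codec)."""
    flat = array("b", [v for row in residue for v in row])   # signed 8-bit values
    return zlib.compress(flat.tobytes())


def apply_residue(rough, compressed_residue):
    """Decoder side: decompress the residue and add it back to the rough frame."""
    flat = array("b")
    flat.frombytes(zlib.decompress(compressed_residue))
    width = len(rough[0])
    return [[rough[y][x] + flat[y * width + x] for x in range(width)]
            for y in range(len(rough))]


rough_frame = [[10, 12],
               [11, 13]]
residue_data = compress_residue([[0, -2],
                                 [1, 0]])
current_frame = apply_residue(rough_frame, residue_data)
```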

Further compression might be possible by special coding of the kinetic information as well as modelling the rough residue representing the difference between the rough frame described above and the current frame. For example, without further processing, the rough residue might contain data about exposed areas. A frame is an image of a real or generated scene and typically contains objects. In some segmentation schemes, segment boundaries follow boundaries of objects in the scene. If relative motion of an object is present between a current frame and a reference frame, there will likely be a portion of the current frame that represents an object or background that does not correspond to a segment in the reference frame because that object or background was obscured by another object in the reference frame but not in the current frame. That area is referred to herein as an "exposed area." An example of exposed areas is illustrated in FIG. 7.
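A minimal sketch of how exposed areas could be located is given below, assuming translation-only kinetic information and a per-pixel label map for the reference frame. The function name `exposed_area` and the data layout are assumptions made for this example.

```python
def exposed_area(labels, translations):
    """Return current-frame pixels that no translated reference segment covers."""
    height, width = len(labels), len(labels[0])
    covered = set()
    for y in range(height):
        for x in range(width):
            dx, dy = translations.get(labels[y][x], (0, 0))
            ty, tx = y + dy, x + dx
            if 0 <= ty < height and 0 <= tx < width:
                covered.add((ty, tx))
    return sorted((y, x) for y in range(height) for x in range(width)
                  if (y, x) not in covered)


segment_labels = [[1, 1, 2, 2],
                  [1, 1, 2, 2]]
# Segment 1 moves one pixel right; the column it vacates becomes exposed.
exposed = exposed_area(segment_labels, {1: (1, 0)})
```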

FIG. 6 is a block diagram of a portion of an encoder that models the rough residue to form model data and a remaining residue that is hopefully more compressible than the rough residue. As shown in that figure, the kinetic information is provided to a modeller 602 that also has access to the current frame, the reference frame and the segment dataset of the reference frame. Modeller 602 generates model data 606 that is output as part of the compressed data stream (see FIG. 3, for an example of placement) and is provided to a residue generator that would generate the remaining residue.

To further compress the compressed data stream, a motion vector coder 604 codes the motion vectors (and possibly other kinetic information) to reduce redundancy in the motion vectors, prior to the information being included in the output video data.
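The text does not say how motion vector coder 604 removes redundancy. One common possibility, shown here strictly as an assumption, is to code each motion vector as a difference from the previous segment's vector, so that segments moving together produce runs of zeros that a later entropy coder can represent compactly.

```python
def delta_encode(vectors):
    """Replace each (dx, dy) with its difference from the previous vector."""
    previous = (0, 0)
    deltas = []
    for dx, dy in vectors:
        deltas.append((dx - previous[0], dy - previous[1]))
        previous = (dx, dy)
    return deltas


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    previous = (0, 0)
    vectors = []
    for ddx, ddy in deltas:
        previous = (previous[0] + ddx, previous[1] + ddy)
        vectors.append(previous)
    return vectors


# Vectors listed in canonical segment order (hypothetical values).
motion_vectors = [(3, 1), (3, 1), (4, 1), (4, 2)]
coded = delta_encode(motion_vectors)
```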

Referring now to FIG. 8, a flowchart of an encoding process is there shown. The process begins with receiving a key frame (step S1) and compressing the key frame to form an encoded key frame that is output as the output of the encoder (S2). The compression can be lossy or lossless (as used herein, lossy compression can include zero loss (lossless) compression). If the compression results in loss of information (S3), the encoded key frame is decompressed to form a reconstructed key frame to be used in subsequent steps in place of the key frame. This allows the decoder to follow along with the encoder's coding process without the encoder having to convey all of its state, because the encoder will operate on what the decoder has, not what the encoder has (although in the lossless case, those will be the same).

The frame is now considered a reference frame and a segmentation is generated (S5). Segmentation can be done using known methods. Some approaches to segmentation are shown in U.S. Pat. No. 6,778,698 (U.S. patent application Ser. No. 09/591,438 filed Jun. 9, 2000 and entitled "Method and Apparatus for Digital Image Segmentation"), which is commonly owned with the present application and is incorporated herein for all purposes. In some cases, an encoder might select among a plurality of segmentation schemes, so the encoder selects a scheme. If the scheme is determinable from information that the encoder knows the decoder has, such as the content of prior processed frames, the encoder need not include an indication of the scheme selection in the output video data. The scheme selected might depend on the image content, as some schemes might work better than others for a given image.

At step S6, the encoder receives a second frame that becomes the current frame. Here we assume that the second frame is not a key frame. If it were, the process would loop back to step S1. If a key frame following a key frame is detected early enough, the segmentation of the first key frame might be omitted if it would not get used as a reference for any nonkey frames. Note that the first frame and second frame need not be consecutive and the first frame need not precede the second frame in a video sequence.

Since we assume that the current frame is a nonkey frame, it is processed as such. First, segments of the reference frame (the key frame described above or a nonkey frame from a prior loop) are matched to the pixels of the current frame (S7) to form a segment mapping. The current frame need not be segmented at this point--the mapping is from segments of the reference frame to pixels of the current frame. The process of motion matching might be performed using one or more of the methods described in U.S. Pat. No. 0,584,213 (U.S. patent application Ser. No. 09/912,743 filed Jul. 23, 2001 and entitled "Motion Matching Method"), which is commonly owned with the present application and is incorporated herein for all purposes.
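To make the idea of matching a reference-frame segment to current-frame pixels concrete, the sketch below performs an exhaustive translation search that minimizes the sum of absolute differences. This is a generic technique substituted for illustration only; it is not the method of the patent cited above, and the function name, search range and out-of-frame penalty are assumptions.

```python
def match_segment(reference, current, labels, seg, search=2, penalty=255):
    """Find the (dx, dy) that best maps one reference segment onto the current frame."""
    height, width = len(reference), len(reference[0])
    pixels = [(y, x) for y in range(height) for x in range(width)
              if labels[y][x] == seg]
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = 0
            for y, x in pixels:
                ty, tx = y + dy, x + dx
                if 0 <= ty < height and 0 <= tx < width:
                    cost += abs(reference[y][x] - current[ty][tx])
                else:
                    cost += penalty          # pixel would leave the frame
            # Prefer lower cost, then the smaller displacement.
            candidate = (cost, abs(dx) + abs(dy), dx, dy)
            if best is None or candidate < best:
                best = candidate
    return best[2], best[3]


ref_frame = [[9, 9, 0, 0],
             [9, 9, 0, 0]]
cur_frame = [[0, 9, 9, 0],
             [0, 9, 9, 0]]
ref_labels = [[1, 1, 2, 2],
              [1, 1, 2, 2]]
vector = match_segment(ref_frame, cur_frame, ref_labels, seg=1)
```

Note that only the reference frame is segmented; the current frame is used as raw pixels, consistent with the mapping described above.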

Next, kinetic information for the segments of the segmentation is generated (S8). The kinetic information for a segment can be simply a motion vector representing an (X, Y) translation of a segment between the reference frame and the current frame, but might include more information. For example, the kinetic information for a segment might indicate other changes in the segment from the reference frame to the current frame, where the changes might include an indication of a z-order of the segment (relative or absolute; determinable by examining changes in the segment from frame to frame), deformation (rotation, dilation, other affine transformation or a nonlinear transformation defined by a set of deformation parameters), lighting changes (an additive offset in one, two or three color planes, such as an additive offset in a luminance plane and/or a multiplicative offset in one, two or three color planes), and/or residue by segment, or pixel color value offset (linear or nonlinear), such as a color offset for the segment and a multiplicative offset for the segment. While z-ordering might be considered a characteristic of a specific image rather than an indication of the changes in a segment from one frame to the next, here "z-ordering" refers to z-ordering as determined by examining the changes of two segments relative to each other from one frame to the next.
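A per-segment kinetic record of the kind listed above could be modelled as in the sketch below. The field names, the single-plane lighting model and the clamping to an 8-bit range are all assumptions for illustration; the text does not define a concrete layout, and affine or nonlinear deformation parameters are omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class KineticInfo:
    """Hypothetical per-segment kinetic record."""
    translation: tuple = (0, 0)   # (X, Y) motion vector
    z_order: int = 0              # relative depth used to resolve overlaps
    gain: float = 1.0             # multiplicative lighting offset
    offset: float = 0.0           # additive lighting offset


def apply_lighting(value, info):
    """Apply the lighting change to one pixel value, clamped to 0..255 (assumed 8-bit)."""
    return min(255, max(0, round(value * info.gain + info.offset)))


brighter = KineticInfo(translation=(2, 0), gain=1.5, offset=5)
```

With this record, a pixel value of 100 in the reference segment would become 155 in the rough frame, and values that would exceed 255 are clamped.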

Once the kinetic information is generated, a rough frame can be generated (S9). A rough frame is the frame that would result by applying the segments of the reference frame generated in step S5 and the kinetic information generated in step S8 to the reference frame. The rough frame, or the difference between the rough frame and the current frame, can be further processed to determine model data, as might result from exposed area processing and applying non-motion related kinetic information. In some embodiments, the model data is not generated or used.

Whether model data is used or not, the remaining difference between the rough frame and the current frame is generally referred to herein as the residue. A residue frame is generated (S11), if not already available, from the current frame by subtracting out the image portions or pixel values represented by the kinetic data applied to the segments of the reference frame and then subtracting out the image portions or pixel values represented by the model data, if used. Alternatively, the residue frame could be generated by subtracting the rough frame from the current frame.
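The alternative mentioned last, subtracting the rough frame from the current frame, reduces to a per-pixel difference. The sketch below assumes both frames are equally sized lists of rows of integer pixel values; the function name is hypothetical.

```python
def residue_frame(current, rough):
    """Per-pixel difference between the current frame and the rough frame."""
    return [[c - r for c, r in zip(current_row, rough_row)]
            for current_row, rough_row in zip(current, rough)]


current_pixels = [[10, 10],
                  [12, 13]]
rough_pixels = [[10, 12],
                [11, 13]]
residue = residue_frame(current_pixels, rough_pixels)
```

The better the kinetic information and model data predict the current frame, the more entries of the residue are zero and the more compressible it becomes.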

This residue frame is compressed (S12), and if the compression is not lossless (S13), the resulting compressed residue frame is decompressed (S14) for use in later steps. If the compression is lossless, the compressed residue frame does not need to be decompressed, as the uncompressed residue frame could be used in the later steps. In some cases, these steps could be omitted regardless of how the compression is done, but preferably the later steps wherein the residue frame is used to generate the reference frame used for later compressions would use the residue frame as it would exist at the decoder, even if that is not exactly what the encoder started with.

Once all of that is done, the encoder can output the compressed current frame as a compressed nonkey frame comprising the set of kinetic information, model data (if used) and a compressed residue frame (S15). Then, the encoder determines whether the next frame will be a key frame (S16). This decision could be made based on some external trigger, a determination that the current frame is from a different scene than the reference frame (scene change detect), or based on the results of compressing the current frame. Although not shown in the figure, the process might include further logic to discard the compressed nonkey frame generated for the current frame if the compression is not good enough and repeat the process with the current frame being treated as a key frame.

If the next frame is a nonkey frame, the current frame is labelled as the reference frame (possibly moved into a frame buffer allocated for the reference frame). Where the compression is not lossless, preferably the decompressed current frame is used as the reference frame instead of the original uncompressed current frame, so that the encoder and the decoder are in sync. The process then continues, looping back to step S5, where the new reference frame is segmented and another frame is received, to become the now current frame. In some embodiments, the subsequent frame uses a frame other than the immediately prior current frame as its reference frame. In some embodiments, more than one prior encoded frame is used as the reference.
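The reason for using the decompressed current frame as the next reference can be seen in the following sketch. It simplifies heavily: frames are one-dimensional lists, there is no motion (the rough frame is just the reference frame), the key frame is assumed lossless, and the lossy residue compression is modelled by a hypothetical uniform quantizer with step size `step`.

```python
def quantize(residue, step):
    """Lossy stand-in for residue compression."""
    return [round(v / step) for v in residue]


def dequantize(levels, step):
    return [v * step for v in levels]


def encode_sequence(frames, step):
    """Encode nonkey frames against the reference as the decoder will see it."""
    reference = list(frames[0])               # key frame, assumed lossless
    stream = []
    for current in frames[1:]:
        residue = [c - r for c, r in zip(current, reference)]
        levels = quantize(residue, step)
        stream.append(levels)
        # Rebuild the frame exactly as the decoder would, and use that as reference.
        reference = [r + d for r, d in zip(reference, dequantize(levels, step))]
    return stream, reference


def decode_sequence(key_frame, stream, step):
    reference = list(key_frame)
    for levels in stream:
        reference = [r + d for r, d in zip(reference, dequantize(levels, step))]
    return reference


frames = [[10, 20, 30], [13, 20, 28], [17, 21, 25]]
stream, encoder_reference = encode_sequence(frames, step=4)
decoded = decode_sequence(frames[0], stream, step=4)
```

Because the encoder predicts from its own reconstruction, quantization error does not accumulate from frame to frame; in this example each decoded pixel stays within half a step of the original.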

If the next frame is to be a key frame, the process loops back to step S1 and repeats from there, with t

