Title: Method and apparatus for eliminating motion artifacts from video
Abstract: A method and apparatus for detecting and correcting motion artifacts in an interlaced video signal converted for progressive video display. A correction is applied where the interlaced video material is determined to originate from a film source and thus to have been converted to video using a process known as 3-2 pulldown. Where the video material is not a result of the 3-2 pulldown process, a check is made for the presence of "pixel motion" so that corrections may be applied to smooth out the pixel motion. To detect 3-2 pulldown or field motion, a video field is compared to the field prior to the previous field to generate a field error. Field errors are generated for five consecutive fields, and a local minimum error that repeats every five fields indicates that the video material originated from a film source using the 3-2 pulldown process. Upon confirmation of 3-2 pulldown, the video material is modified to correct for the mixing of two film frames into one interlaced video frame by ensuring that the two fields of the de-interlaced video frame contain data from the same film frame. Where the video material did not originate from a film source but pixel motion is detected, the pixel motion is smoothed out by an averaging method. The odd and even fields of the resulting video data are subsequently combined to form progressive video material.
Patent Number: 6,839,094 Issued on 01/04/2005 to Tang, et al.
Inventors: Tang; Che Wing (Baldwin Park, CA); Truong; Dung Duc (El Monte, CA)
Assignee: RGB Systems, Inc. (Anaheim, CA)
Appl. No.: 738281
Filed: December 14, 2000
Current U.S. Class: 348/607; 348/441; 348/448; 348/452; 348/458; 348/558; 348/700
Intern'l Class: H04N 005/08; H04N 005/14; H04N 005/46; H04N 007/01; H04N 011/20
Field of Search: 348/607,441,443,448,452,458,459,526,527,558,700,701; 386/1,4,52,131
References Cited
U.S. Patent Documents
| 4881125 | Nov., 1989 | Krause | |
| 5317398 | May, 1994 | Casavant et al. | |
| 5329309 | Jul., 1994 | Doricott et al. | |
| 5446497 | Aug., 1995 | Keating et al. | |
| 5452011 | Sep., 1995 | Martin et al. | |
| 5508750 | Apr., 1996 | Hewlett et al. | |
| 5550592 | Aug., 1996 | Markandey et al. | |
| 5606373 | Feb., 1997 | Dopp et al. | |
| 5821991 | Oct., 1998 | Kwok | |
| 5828786 | Oct., 1998 | Rao et al. | 382/236 |
| 5844618 | Dec., 1998 | Horiike et al. | |
| 5852473 | Dec., 1998 | Horne et al. | |
| 5872600 | Feb., 1999 | Suzuki | 348/459 |
| 5929902 | Jul., 1999 | Kwok | |
| 5930445 | Jul., 1999 | Peters et al. | 386/52 |
| 6055018 | Apr., 2000 | Swan | 348/448 |
| 6058140 | May, 2000 | Smolenski | |
| 6144410 | Nov., 2000 | Kikuchi et al. | 348/441 |
| 6157412 | Dec., 2000 | Westerman et al. | 348/558 |
| 6201577 | Mar., 2001 | Swartz | 348/558 |
| 6282245 | Aug., 2001 | Oishi et al. | 375/240 |
| 6340990 | Jan., 2002 | Wilson | 348/448 |
| 6380978 | Apr., 2002 | Adams et al. | 348/452 |
| 6408024 | Jun., 2002 | Nagao et al. | 375/240 |
| 6469745 | Oct., 2002 | Yamada et al. | 348/558 |
| 6525774 | Feb., 2003 | Sugihara | 348/459 |
| 6542199 | Apr., 2003 | Manbeck et al. | 348/459 |
| 6559890 | May, 2003 | Holland et al. | 348/441 |
| 6563550 | May, 2003 | Kahn et al. | 348/700 |
| 6670996 | Dec., 2003 | Jiang | 348/558 |
| 2002/0149703 | Oct., 2002 | Adams et al. | 348/700 |
| 2003/0098924 | May, 2003 | Adams et al. | 348/448 |
| 2003/0193614 | Oct., 2003 | Holland et al. | 348/441 |
Foreign Patent Documents
| 1065879 | Jan., 2001 | EP |
| WO 99/20040 | Apr., 1999 | WO |
Primary Examiner: Yenke; Brian P.
Attorney, Agent or Firm: The Hecker Law Group, PLC
Claims
What is claimed is:
1. A method for eliminating motion artifacts from video signals during
conversion from interlaced to progressive comprising:
receiving a first video signal, said first video signal comprising one or
more video frames arranged in sequence, each of said one or more video
frames having a first field and a second field;
determining if said first video signal originates from a film source by
examining successive video fields of said first video signal to locate a
repeat field caused by a 3-2 pulldown conversion;
generating a frame of a second video signal for each frame of said one or
more frames of said first video signal, said frame of said second video
signal having a first component and a second component using said first
field and said second field of said first video signal such that said
frame of said second video signal comprises pixel data from a common film
frame if said first video signal originates from said film source, wherein
said generating said frame of said second video signal comprises:
generating a first temporary video signal having fields corresponding to
current fields of said first video signal;
generating a second temporary video signal having fields corresponding to
fields of said first video signal delayed by one field;
generating a third temporary video signal having fields corresponding to
fields of said first video signal delayed by two fields;
generating a counter for counting fields of said first video signal with
values starting from zero at detection of said repeat field and
incrementing thereafter by one, such that said repeat field is count zero,
a next field after said repeat field is count one, a next field after said
count one is count two, a next field after said count two is count three,
and a next field after said count three is count four;
generating said frame of said second video signal by using said second
temporary video signal and said third temporary video signal when said
values of said counter are zero, two, and three;
generating said frame of said second video signal by using said first
temporary video signal and said second temporary video signal when said
values of said counter are one and four.
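The counter-driven selection recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name, the `zero_index` parameter, and the single-letter field labels are hypothetical, the wrap of the counter after four is borrowed from the five field counter of claim 7, and the alignment of count zero with the toy cadence below is chosen only so that the example is self-consistent, since detection latency is not modelled.

```python
def pair_fields_by_counter(fields, zero_index):
    """Pair fields according to the counter rule of claim 1 (illustrative only).

    fields     -- fields in arrival order (any objects, here film-frame labels)
    zero_index -- hypothetical index at which the counter is assumed to be zero
    Returns a list of (counter, (field_a, field_b)) tuples.
    """
    if zero_index < 2:
        raise ValueError("two earlier fields are needed for the delayed signals")
    result = []
    for i in range(zero_index, len(fields)):
        counter = (i - zero_index) % 5       # assumed to wrap after four
        current = fields[i]                  # first temporary signal
        delayed_one = fields[i - 1]          # second temporary signal
        delayed_two = fields[i - 2]          # third temporary signal
        if counter in (0, 2, 3):
            pair = (delayed_one, delayed_two)
        else:                                # counter is 1 or 4
            pair = (current, delayed_one)
        result.append((counter, pair))
    return result


# Toy cadence: each letter names the film frame a field was taken from.
cadence = list("AAABBCCCDD")
for counter, pair in pair_fields_by_counter(cadence, zero_index=5):
    print(counter, pair)
```

With this assumed alignment, every printed pair carries a single film-frame label, which is the "common film frame" property the claim requires.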
2. The method of claim 1 wherein data in said second video signal is
arranged such that said first component of said second video signal
comprises luminance data from said first video signal, and said second
component of said second video signal comprises chrominance data from said
first video signal.
3. A computer program product comprising:
a computer readable medium having computer program code embodied therein
for eliminating motion artifacts from video signals during conversion from
interlaced to progressive, said computer readable medium comprising
computer program code configured to cause a computer to:
receive an interlaced video signal comprising one or more video frames
arranged in sequence, each of said one or more video frames having a first
field and a second field;
determine if said first video signal originates from a film source by
examining successive video fields of said first video signal to locate a
repeat field caused by a 3-2 pulldown conversion;
generate a frame of a second video signal for each frame of said one or
more frames of said first video signal, said frame of said second video
signal having a first component and a second component using said first
field and said second field of said first video signal such that said
frame of said second video signal comprises pixel data from a common film
frame if said first video signal originates from said film source, wherein
said computer program code configured to cause a computer to generate said
frame of said second video signal comprises computer program code
configured to cause a computer to:
generate a first temporary video signal having fields corresponding to
current fields of said first video signal;
generate a second temporary video signal having fields corresponding to
fields of said first video signal delayed by one field;
generate a third temporary video signal having fields corresponding to
fields of said first video signal delayed by two fields;
generate a counter for counting fields of said first video signal with
values starting from zero at detection of said repeat field and
incrementing thereafter by one, such that said repeat field is count zero,
a next field after said repeat field is count one, a next field after said
count one is count two, a next field after said count two is count three,
and a next field after said count three is count four;
generate said frame of said second video signal by using said second
temporary video signal and said third temporary video signal when said
values of said counter are zero, two, and three;
generate said frame of said second video signal by using said first
temporary video signal and said second temporary video signal when said
values of said counter are one and four.
4. The computer program product of claim 3 wherein data in said second
video signal is arranged such that said first component of said second
video signal comprises luminance data from said first video signal, and
said second component of said second video signal comprises chrominance
data from said first video signal.
5. An apparatus for eliminating motion artifacts from video signals
comprising:
a digitizer unit to convert a first video signal in analog form to digital
form;
a memory unit to store said digital form;
a processing unit, said processing unit having computer program code, said
computer program code comprising:
a method receiving said digital form of said first video signal, said first
video signal comprising one or more video frames arranged in sequence,
each of said one or more video frames having a first field and a second
field;
a method determining if said first video signal originates from a film
source by examining successive video fields of said first video signal to
locate a repeat field caused by a 3-2 pulldown conversion;
a method generating a frame of a second video signal for each frame of said
one or more frames of said first video signal, said frame of said second
video signal having a first component and a second component using said
first field and said second field of said first video signal such that
said frame of said second video signal comprises pixel data from a common
film frame if said first video signal originates from said film source,
wherein said method for generating said frame of said second video signal
comprises:
generating a first temporary video signal having fields corresponding to
current fields of said first video signal;
generating a second temporary video signal having fields corresponding to
fields of said first video signal delayed by one field;
generating a third temporary video signal having fields corresponding to
fields of said first video signal delayed by two fields;
generating a counter for counting fields of said first video signal with
values starting from zero at detection of said repeat field and
incrementing thereafter by one, such that said repeat field is count zero,
a next field after said repeat field is count one, a next field after said
count one is count two, a next field after said count two is count three,
and a next field after said count three is count four;
generating said frame of said second video signal by using said second
temporary video signal and said third temporary video signal when said
values of said counter are zero, two, and three;
generating said frame of said second video signal by using said first
temporary video signal and said second temporary video signal when said
values of said counter are one and four.
6. The apparatus of claim 5 wherein data in said second video signal is
arranged such that said first component of said second video signal
comprises luminance data from said first video signal, and said second
component of said second video signal comprises chrominance data from said
first video signal.
7. A method for eliminating motion artifacts from video signals during
conversion from interlaced to progressive comprising:
receiving an interlaced video signal comprising a plurality of video
fields;
determining if said interlaced video signal originates from a film source
by examining successive video fields of said interlaced video signal to
locate a repeat field caused by a 3-2 pulldown conversion;
generating a field of progressive video signal having a first component and
a second component for each of said plurality of video fields of said
interlaced video signal by processing to remove pixel motion if said
interlaced video signal did not originate from said film source; and
generating said field of progressive video signal having a first component
and a second component for each field of said plurality of video fields of
said interlaced video signal if said interlaced video signal originates
from said film source, comprising:
starting a five field counter at location of said repeat field, said five
field counter counting from zero to four and then restarting;
generating said field of said progressive video signal by using a previous
field and a field prior to said previous field of said interlaced video
signal for said first component and said second component when said field
counter is zero, two, and three; and
generating said field of said progressive video signal by using a current
field and said previous field of said interlaced video signal for said
first component and said second component when said field counter is one and
four.
8. The method of claim 7 wherein said examining successive video fields to
locate a repeat field comprises:
selecting a field of said interlaced video signal to process;
generating field error for each of five successive fields, wherein said
five successive fields comprise said selected field and four previous
fields thereafter;
declaring a repeat field at said selected field if said field error in a
third field of said five successive fields is a local minimum among said
five successive fields.
9. The method of claim 8 wherein said declaring a repeat field comprises:
detecting a repeat field; and
confirming said repeat field occurring every five fields thereafter.
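The detection and confirmation steps of claims 8 and 9 can be sketched as follows. This is one possible reading, not the patented implementation: the function names are hypothetical, the five errors are assumed to be ordered oldest first, "local minimum" is interpreted as strictly smaller than the other four errors, and the number of cadence periods checked during confirmation is an arbitrary illustrative choice.

```python
def is_repeat_candidate(errors):
    """True if the third of five consecutive field errors is the smallest."""
    if len(errors) != 5:
        raise ValueError("exactly five field errors are required")
    middle = errors[2]
    return all(middle < e for i, e in enumerate(errors) if i != 2)


def confirm_cadence(errors, centre, periods=3):
    """Check that the minimum recurs every five fields.

    errors  -- field errors in arrival order
    centre  -- index of the first suspected minimum
    periods -- hypothetical number of consecutive periods to verify
    """
    for k in range(periods):
        c = centre + 5 * k
        if c - 2 < 0 or c + 2 >= len(errors):
            return False                     # not enough data to decide
        if not is_repeat_candidate(errors[c - 2:c + 3]):
            return False
    return True


# Made-up field errors with a dip every five fields.
errors = [90, 80, 3, 85, 95, 88, 92, 2, 91, 86, 90, 84, 4, 89, 93]
print(is_repeat_candidate(errors[0:5]))      # dip sits in the third position
print(confirm_cadence(errors, centre=2))     # dip recurs at indices 2, 7, 12
print(confirm_cadence(errors, centre=3))     # wrong alignment is rejected
```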
10. The method of claim 8 wherein said generating field error comprises:
selecting a subset of pixels in a field such that said subset excludes
pixels having subtitles;
obtaining the absolute value of the sum of the difference in pixel
intensity between said subset in a current field and said subset in a
field prior to the previous field.
11. The method of claim 8 wherein said generating field error comprises:
selecting a subset of pixels in a field such that said subset excludes
pixels having subtitles;
obtaining the sum of the absolute value of the difference in pixel
intensity between said subset in a current field and said subset in a
field prior to the previous field.
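Claims 10 and 11 differ only in where the absolute value is taken, and a small Python sketch makes the difference concrete. The function names and the sample data are hypothetical, fields are modelled as lists of rows of intensities, and the subtitle-free subset is assumed, purely for illustration, to be a chosen set of rows.

```python
def field_error_abs_of_sum(current, two_back, rows):
    """Claim 10 style: absolute value of the summed differences."""
    total = 0
    for r in rows:
        for a, b in zip(current[r], two_back[r]):
            total += a - b
    return abs(total)


def field_error_sum_of_abs(current, two_back, rows):
    """Claim 11 style: sum of the absolute differences."""
    total = 0
    for r in rows:
        for a, b in zip(current[r], two_back[r]):
            total += abs(a - b)
    return total


# Made-up fields; the last row stands in for a subtitle region and is skipped.
current = [[10, 20], [30, 40], [200, 200]]
two_back = [[20, 10], [30, 40], [0, 0]]
rows_without_subtitles = range(2)

print(field_error_abs_of_sum(current, two_back, rows_without_subtitles))  # 0
print(field_error_sum_of_abs(current, two_back, rows_without_subtitles))  # 20
```

In this example the positive and negative differences cancel in the first measure but not in the second, so the two measures can rank the same pair of fields differently.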
12. The method of claim 7 wherein said examining successive video fields to
locate a repeat field comprises:
selecting a field of said interlaced video signal to process;
generating field error for each of five successive fields, wherein said
five successive fields comprise said selected field and four previous
fields thereafter;
declaring a repeat field at said selected field if said field error in a
third field of said five successive fields is less than a predetermined
threshold.
13. The method of claim 7 further comprising:
determining if pixel motion is present in said interlaced video signal when
said first video signal did not originate from said film source, wherein
said pixel motion is determined for a selected pixel in a selected field;
wherein said processing to remove pixel motion comprises replacing said
selected pixel with an average of a pixel directly above and a pixel
directly below said selected pixel in a previous field to said selected
field if said pixel motion is present.
14. The method of claim 13 wherein said determining if pixel motion is
present comprises:
comparing the difference between pixels in said selected field with pixels
from a field prior to said previous field.
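The pixel motion handling of claims 13 and 14 can be sketched as below. This is an illustration under stated assumptions rather than the patented implementation: each field is modelled as a dictionary mapping a frame line number to a row of intensities, the motion test is assumed to be a simple threshold on the difference, the threshold value of 16 is arbitrary, integer averaging is assumed, and only interior lines (with a neighbour above and below in the previous field) are handled.

```python
def pixel_has_motion(selected, two_back, line, x, threshold):
    """Compare a pixel with the same pixel in the field prior to the previous field."""
    return abs(selected[line][x] - two_back[line][x]) > threshold


def corrected_pixel(selected, previous, two_back, line, x, threshold=16):
    """Return the pixel value to use at (line, x) of the selected field.

    If motion is detected, the value is the average of the pixels directly
    above and directly below in the previous field; otherwise it is unchanged.
    """
    if not pixel_has_motion(selected, two_back, line, x, threshold):
        return selected[line][x]
    above = previous[line - 1][x]
    below = previous[line + 1][x]
    return (above + below) // 2


# Made-up data: the selected field holds line 1, the previous field lines 0 and 2.
selected = {1: [100, 50]}
previous = {0: [10, 20], 2: [30, 40]}
two_back = {1: [100, 90]}

print(corrected_pixel(selected, previous, two_back, line=1, x=0))  # unchanged
print(corrected_pixel(selected, previous, two_back, line=1, x=1))  # averaged
```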
15. The method of claim 7 further comprising:
scaling said progressive video signal to generate a desired video data.
16. The method of claim 15 further comprising:
means for converting said desired video data for output on analog devices.
17. A computer program product comprising:
a computer readable medium having computer program code embodied therein
for eliminating motion artifacts from video signals during conversion from
interlaced to progressive, said computer readable medium comprising
computer program code configured to cause a computer to:
receive an interlaced video signal comprising a plurality of video fields;
determine if said interlaced video signal originates from a film source
by examining successive video fields of said interlaced video signal to
locate a repeat field caused by a 3-2 pulldown conversion;
generate a field of progressive video signal having a first component and a
second component for each of said plurality of video fields of said
interlaced video signal by processing to remove pixel motion if said
interlaced video signal did not originate from said film source; and
generate said field of progressive video signal having a first component
and a second component for each field of said plurality of video fields of
said interlaced video signal if said interlaced video signal originates
from said film source, comprising computer program code configured to
cause a computer to:
start a five field counter at location of said repeat field, said five
field counter counting from zero to four and then restarting;
generate said field of said progressive video signal by using a previous
field and a field prior to said previous field of said interlaced video
signal for said first component and said second component when said field
counter is zero, two, and three; and
generate said field of said progressive video signal by using a current
field and said previous field of said interlaced video signal for said
first component and said second component when said field counter is one and
four.
18. The computer program product of claim 7 wherein said examining
successive video fields to locate a repeat field comprises computer
program code configured to cause a computer to:
select a field of said interlaced video signal to process;
generate field error for each of five successive fields, wherein said five
successive fields comprise said selected field and four previous fields
thereafter;
declare a repeat field at said selected field if said field error in a
third field of said five successive fields is a local minimum among said
five successive fields.
19. The computer program product of claim 18 wherein said computer program
code configured to cause a computer to declare a repeat field comprises
computer program code configured to cause a computer to:
detect a repeat field; and
confirm said repeat field occurring every five fields thereafter.
20. The computer program product of claim 18 wherein said generating field
error comprises computer program code configured to cause a computer to:
select a subset of pixels in a field such that said subset excludes pixels
having subtitles;
obtain the absolute value of the sum of the difference in pixel intensity
between said subset in a current field and said subset in a field prior to
the previous field.
21. The computer program product of claim 18 wherein said generating field
error comprises computer program code configured to cause a computer to:
select a subset of pixels in a field such that said subset excludes pixels
having subtitles;
obtain the sum of the absolute value of the difference in pixel intensity
between said subset in a current field and said subset in a field prior to
the previous field.
22. The computer program product of claim 17 wherein said examining
successive video fields to locate a repeat field comprises computer
program code configured to cause a computer to:
select a field of said interlaced video signal to process;
generate field error for each of five successive fields, wherein said five
successive fields comprise said selected field and four previous fields
thereafter;
declare a repeat field at said selected field if said field error in a
third field of said five successive fields is less than a predetermined
threshold.
23. The computer program product of claim 17 further comprising computer
program code configured to cause a computer to:
determine if pixel motion is present in said interlaced video signal when
said first video signal did not originate from said film source, wherein
said pixel motion is determined for a selected pixel in a selected field;
wherein said processing to remove pixel motion comprises replacing said
selected pixel with an average of a pixel directly above and a pixel
directly below said selected pixel in a previous field to said selected
field if said pixel motion is present.
24. The computer program product of claim 23 wherein said determining if
pixel motion is present comprises:
comparing the difference between pixels in said selected field with pixels
from a field prior to said previous field.
25. The computer program product of claim 17 further comprising computer
program code configured to cause a computer to:
scale said progressive video signal to generate a desired video data.
26. The computer program product of claim 25 further comprising computer
program code configured to cause a computer to:
means for converting said desired video data for output on analog devices.
27. An apparatus for eliminating motion artifacts from video signals during
conversion from interlaced to progressive comprising:
a digitizer unit to convert an interlaced video signal in analog form to
digital form;
a memory unit to store said digital form;
a processing unit, said processing unit having computer program code, said
computer program code comprising:
a method receiving an interlaced video signal comprising a plurality of
video fields;
a method determining if said interlaced video signal originates from a film
source by examining successive video fields of said interlaced video
signal to locate a repeat field caused by a 3-2 pulldown conversion;
a method generating a field of progressive video signal having a first
component and a second component for each of said plurality of video
fields of said interlaced video signal by processing to remove pixel
motion if said interlaced video signal did not originate from said film
source; and
a method generating said field of progressive video signal having a first
component and a second component for each field of said plurality of video
fields of said interlaced video signal if said interlaced video signal
originates from said film source, comprising:
starting a five field counter at location of said repeat field, said five
field counter counting from zero to four and then restarting;
generating said field of said progressive video signal by using a previous
field and a field prior to said previous field of said interlaced video
signal for said first component and said second component when said field
counter is zero, two, and three; and
generating said field of said progressive video signal by using a current
field and said previous field of said interlaced video signal for said
first component and said second component when said field counter is one and
four.
28. The apparatus of claim 27 wherein said examining successive video
fields to locate a repeat field comprises:
selecting a field of said interlaced video signal to process;
generating field error for each of five successive fields, wherein said
five successive fields comprise said selected field and four previous
fields thereafter;
declaring a repeat field at said selected field if said field error in a
third field of said five successive fields is a local minimum among said
five successive fields.
29. The apparatus of claim 28 wherein said declaring a repeat field
comprises:
detecting a repeat field; and
confirming said repeat field occurring every five fields thereafter.
30. The apparatus of claim 28 wherein said generating field error
comprises:
selecting a subset of pixels in a field such that said subset excludes
pixels having subtitles;
obtaining the absolute value of the sum of the difference in pixel
intensity between said subset in a current field and said subset in a
field prior to the previous field.
31. The apparatus of claim 28 wherein said generating field error
comprises:
selecting a subset of pixels in a field such that said subset excludes
pixels having subtitles;
obtaining the sum of the absolute value of the difference in pixel
intensity between said subset in a current field and said subset in a
field prior to the previous field.
32. The apparatus of claim 27 wherein said examining successive video
fields to locate a repeat field comprises:
a method selecting a field of said interlaced video signal to process;
a method generating field error for each of five successive fields, wherein
said five successive fields comprise said selected field and four previous
fields thereafter;
a method declaring a repeat field at said selected field if said field
error in a third field of said five successive fields is less than a
predetermined threshold.
33. The apparatus of claim 27 further comprising:
a method determining if pixel motion is present in said interlaced video
signal when said first video signal did not originate from said film
source, wherein said pixel motion is determined for a selected pixel in a
selected field;
wherein said method of processing to remove pixel motion comprises
replacing said selected pixel with an average of a pixel directly above
and a pixel directly below said selected pixel in a previous field to said
selected field if said pixel motion is present.
34. The apparatus of claim 33 wherein said determining if pixel motion is
present comprises:
comparing the difference between pixels in said selected field with pixels
from a field prior to said previous field.
35. The apparatus of claim 27 further comprising:
a method scaling said progressive video signal to generate a desired video
data.
36. The apparatus of claim 35 further comprising:
means for converting said desired video data for output on analog devices.
37. A method for eliminating motion artifacts from an interlaced video
signal converted for progressive video display comprising:
receiving an interlaced video signal having one or more video frames, said
interlaced video signal having an odd field and an even field in each of
said one or more video frames;
determining if said interlaced video signal originates from a film source
by generating field errors for five successive video fields and
identifying the local minimum of said field errors as a repeat field
caused by a 3-2 pulldown conversion of said film source to said interlaced
video signal, said repeat field occurring every five fields thereafter;
generating a frame of progressive video signal having an odd component
comprising said odd field and an even component comprising said even field
for each of said one or more video frames of said interlaced video signal
by processing to remove pixel motion if said interlaced video signal did
not originate from said film source; and
generating said frame of progressive video signal having said odd component
and said even component for each of said one or more video frames of said
interlaced video signal if said interlaced video signal originates from
said film source, comprising:
starting a five field counter at location of said repeat field, said five
field counter counting from zero to four and then restarting;
generating said frame of said progressive video signal by using a previous
field and a field prior to said previous field of said interlaced video
signal for said odd component and said even component when said field
counter is zero, two, and three; and
generating said frame of said progressive video signal by using a current
field and said previous field of said interlaced video signal for said odd
component and said even component when said field counter is one and four.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to the field of video compensation. More
specifically the invention relates to detecting and correcting motion
artifacts in video source signals.
2. Background Art
In North America the video displayed on a normal television screen is an
interlaced video signal conforming to a standard called NTSC (National
Television System Committee). This is not the same video displayed on
most computer screens, since computer screens mostly use non-interlaced
display devices.
Interlaced video simply means that for each picture frame displayed on the
television screen, there are two video fields being displayed one after
the other. The first field is commonly known as the odd field, and the
second field as the even field. Since the interlaced video frame is
displayed at 30 frames (i.e. 60 fields) every second, the odd field is
displayed in the first one sixtieth (1/60) of a second while the even
field is displayed in the second one sixtieth of a second.
Each display monitor comprises a series of horizontal and vertical lines.
For example, the resolution of an NTSC television monitor is approximately
858 horizontal counts by 525 vertical lines. Actual resolution excluding
blanking lines is 720 by 480. In a television display, the odd field of
the interlaced video signal is displayed on the odd numbered (i.e. 1, 3,
5, . . . ) horizontal lines of the monitor and the even field is displayed
on the even numbered (i.e. 0, 2, 4, 6, . . . ) horizontal lines. Thus, at
brief instances of time, alternating lines of the television screen do not
have any video display (i.e. are blank). However, because the display rate
is faster than can be perceived by the human eye, a viewer is not able to
discern the blanked lines.
Video is a linear medium like audio, unlike photography or film. A film
camera captures the entire frame of a picture in a single instant. But
video was originally designed to be transmitted over the air. Video images
must be broken up and transmitted or recorded as a series of lines, one
after the other. At any given millisecond, the video image is actually
just a dot speeding across the face of the monitor.
One problem with NTSC is that it is an analog system. In non-analog systems
such as computer video, numbers represent colors and brightness. But with
analog television, the signal is just voltages, and voltages are affected
by wire length, connectors, heat, cold, videotape, and other conditions.
Digital data does not have such problematic characteristics. Thus, it
would be advantageous to store or transmit video signals in a digital
format.
Interlaced NTSC video must be converted to non-interlaced (i.e.
progressive) video for display on devices such as computer screens. The
conversion is generally performed in the digital domain; therefore, the
NTSC video signal must first be converted from analog to digital and then
the odd and even fields are combined into one complete non-interlaced
video frame such that the complete frame is displayed in one scan of the
video signal.
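The combining of the odd and even fields described above can be illustrated with a short Python sketch. It is only a sketch: the function name is hypothetical, each field is modelled as a list of already digitized lines, and the even field is assumed to supply lines 0, 2, 4, . . . and the odd field lines 1, 3, 5, . . . , following the line numbering used earlier in this description.

```python
def weave_fields(even_field, odd_field):
    """Interleave two fields into one non-interlaced frame.

    even_field -- lines destined for frame lines 0, 2, 4, ...
    odd_field  -- lines destined for frame lines 1, 3, 5, ...
    """
    if len(even_field) != len(odd_field):
        raise ValueError("both fields must have the same number of lines")
    frame = []
    for even_line, odd_line in zip(even_field, odd_field):
        frame.append(even_line)
        frame.append(odd_line)
    return frame


print(weave_fields(["e0", "e2"], ["o1", "o3"]))  # ['e0', 'o1', 'e2', 'o3']
```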
Analog video inputs may be available in any of the different color models
such as the C-Video, S-Video, or YUV (or YIQ). A color model (also color
space) facilitates the specification of colors in some standard, generally
accepted way (e.g., RGB). In essence, a color model is a specification of a
3-Dimensional coordinate system and a subspace within that system where
each color is represented by a single point.
The C-Video or Composite Video is a type of video signal in which all
information--the red, blue, and green signals (and sometimes audio signals
as well)--are mixed together. This is the type of signal used by
televisions in the United States. The S-Video, short for Super-Video, is a
technology for transmitting video signals over a cable by dividing the
video information into two separate signals: one for color (chrominance),
and the other for brightness (luminance). When sent to a television, this
produces sharper images than composite video, where the video information
is transmitted as a single signal over one wire. This is because
televisions are designed to display separate Luminance (Y) and Chrominance
(C) signals. The terms Y/C video and S-Video are used interchangeably.
The YUV or YIQ Color model is used in commercial color TV broadcasting. The
Y generally stands for intensity (luminance, brightness) and thus provides
all the information required by a monochrome television. The other two
components carry the color (chrominance) information. Each model component
may be represented in various bit depths. For example, the brightness
component may range from 1-bit (black and white), through the usual 8-bit
(representing 256 values of gray), to 10- or 12-bit. Note that brightness,
luminance, and intensity are used interchangeably in this specification.
Whatever the color model of the input, the incoming video signal may need
to be converted to progressive video for display on non-interlaced
devices. Video signals originate from various sources. For example, a
video material may have originated from a film source, or may have been
recorded using an interlaced video camera. In recent years there has been
a proliferation of film material being converted to NTSC video for display
on regular television. For example, movies stored on videotape usually
originated from a film counterpart. Film data is shot at twenty-four
frames a second (24 frames/sec) while NTSC data is at 30 frames a second
(i.e. 60 fields/second); therefore, the film data must be scaled in
frequency from 24 frames/second to the NTSC rate of 30 frames/second (i.e.
60 fields/sec). To achieve this, a method called 3-2 pulldown is employed.
Thus, 3-2 pulldown is a method for transferring film material that is at
24 frames per second to NTSC video at 30 frames per second. That is,
fitting 24 film frames into 30 video frames requires that every four film
frames be converted to five video frames (i.e. ten video fields).
FIG. 1 is an illustration of the mechanics of 3-2 pulldown. In this
illustration, row 100 contains film frames f1-f7 that are mapped into row
106 comprising interlaced video frames v1-v8. Each interlaced video frame
comprises an odd and an even field shown in row 104. For example,
interlaced video frame v1 comprises interlaced video fields 1o and 1e,
interlaced video frame v2 comprises interlaced video fields 2o and 2e, and
so on for all the video frames up to v8. Row 102 represents the film
frame numbers that are mapped into the respective video fields. As shown
in row 102, film frame 1 (i.e. f1) is mapped into video fields 1o, 1e, and
2o; film frame 2 (i.e. f2) is mapped into video fields 2e and 3o; film
frame 3 (i.e. f3) is mapped into video fields 3e, 4o, and 4e; film frame 4
(i.e. f4) is mapped into video fields 5o and 5e. This process continues
whereby one film frame is mapped into three video fields, followed by the
next film frame being mapped into the next two video fields. This
three-two cycle repeats itself, hence the name 3-2 pulldown.
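The field mapping described for FIG. 1 can be sketched in a few lines of Python. This is only an illustration: the function name pulldown_32 and the label format (e.g. "2o") are assumptions made for this example and do not come from the specification.

```python
def pulldown_32(film_frames):
    """Map film frames to interlaced video fields using a 3-2 cadence.

    Returns a list of (field_label, film_frame) pairs such as ("1o", "f1").
    """
    fields = []
    for i, frame in enumerate(film_frames):
        repeats = 3 if i % 2 == 0 else 2      # three fields, then two, ...
        fields.extend([frame] * repeats)

    labeled = []
    for n, frame in enumerate(fields):
        video_frame = n // 2 + 1              # two fields per video frame
        parity = "o" if n % 2 == 0 else "e"   # odd field first, then even
        labeled.append((f"{video_frame}{parity}", frame))
    return labeled


mapping = pulldown_32(["f1", "f2", "f3", "f4"])
for label, frame in mapping:
    print(label, "<-", frame)
```

Four film frames yield ten fields (five video frames), and 24 film frames yield 60 fields, which matches the 24-to-30 frame rate conversion described above.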
Further, in this illustration of the 3-2 pulldown phenomenon, film frames
f1-f4 are mapped into video frames v1-v5. Film frames f1-f4 and video
frames v1-v5 must occur in the same 1/6th of a second to preserve the
length of the material being converted. As shown, film frame f1 is mapped
into the odd and even fields of video frame v1 and into the odd field of
video frame v2, and film frame f2 is mapped into the even field of video
frame v2 and into the odd field of video frame v3. This results in video
frame v2 having film frame f1 in its odd field and film frame f2 in its
even field, and video frame v3 having film frame f2 in its odd field and
film frame f3 in its even field. Thus video frames v2 and v3 are composed
of mixed film frames. The phenomenon known as field motion, illustrated by
a "Yes" in row 108, occurs in video frames with mixed film frames.
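As a rough check of which video frames carry a "Yes" for field motion, the following sketch lists the video frames whose odd and even fields come from different film frames. The helper names field_sources and mixed_video_frames are hypothetical, and film frames are represented simply by their 1-based index.

```python
def field_sources(num_film_frames):
    """Film frame index feeding each video field under a 3-2 cadence."""
    sources = []
    for i in range(1, num_film_frames + 1):
        sources.extend([i] * (3 if i % 2 == 1 else 2))
    return sources


def mixed_video_frames(num_film_frames):
    """1-based video frame numbers whose two fields differ in film source."""
    src = field_sources(num_film_frames)
    mixed = []
    for v in range(len(src) // 2):
        odd, even = src[2 * v], src[2 * v + 1]
        if odd != even:
            mixed.append(v + 1)
    return mixed


print(mixed_video_frames(4))   # video frames v2 and v3 are mixed
print(mixed_video_frames(8))   # the pattern repeats every five video frames
```

Under this model, two of every five video frames are mixed, consistent with v2 and v3 in the illustration.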
When viewed on an NTSC television, the video generated from the 3-2
pulldown is visually tolerable to the viewer because television displays a
single field at a time; hence the video appears continuous. However, if the
NTSC data originating from film source is subsequently converted to
progressive video for display on a computer display, for example, a
problem known as "field motion" may occur. Field motion occurs because
each progressive video frame displays both of its fields at the same time.
One method of generating progressive video material is to combine the odd
and even fields of an interlaced video material to generate a frame of the
progressive video material. Using a progressive material generated from
film material, for example, progressive video frame v1 comprises film
frame f1 in its odd and even lines. Progressive video frame v2 comprises
film frame f1 in its odd lines and film frame f2 in its even lines. If
film frames f1 and f2 are shot at different times and if an object has
moved during that time, the object may be at different locations on film
frames f1 and f2. Now, if the progressive video frame v2 is viewed in
still frame, the object will be distorted. This distortion is what is
known as "field motion". The distortion becomes more pronounced as the
video material is scaled-up to fit higher resolution display devices.
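The distortion can be seen in a toy example that combines ("weaves") an odd field and an even field into one progressive frame. The function name weave and the character-based picture are assumptions chosen for readability; a real system would operate on pixel data.

```python
def weave(odd_field, even_field):
    """Interleave two fields (lists of scan lines) into one progressive frame.

    The odd field supplies lines 1, 3, 5, ... and the even field supplies
    lines 2, 4, 6, ... (1-based line numbering).
    """
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)
        frame.append(even_line)
    return frame


# Hypothetical picture: a vertical bar sits at column 1 in film frame f1
# and has moved to column 4 by the time film frame f2 was shot.
f1_line = ".#...."
f2_line = "....#."

# Progressive frame v2: odd lines from f1, even lines from f2.
v2 = weave([f1_line, f1_line], [f2_line, f2_line])
for line in v2:
    print(line)
```

The printed frame alternates between the two bar positions from line to line, which is the comb-like distortion that the text calls field motion.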
Video Scaling
Video scalers are employed to change the size of an original video signal
to fit a desired video output device. A scaler changes the size of an
image without changing its shape, for instance, when the image size does
not fit the display device. Therefore, the main benefit of a scaler is its
ability to change its output rate to match the abilities of a display
device. This is especially advantageous in the case of digital display
devices because digital display devices produce images on a fixed matrix
and in order for a digital display device to provide optimal light output,
the entire matrix should be used.
Since a scaler can scale the output both horizontally and vertically, it
can change the "aspect ratio" of an image. Aspect ratios are the
relationship of the horizontal dimension to the vertical dimension of a
rectangle. Thus, when included as part of a graphics switch, a scaler can
adjust horizontal and vertical size and positioning, for a variety of
video inputs. For example, in viewing screens, the aspect ratio for
standard TV is 4:3, or 1.33:1; HDTV is 16:9, or 1.78:1. Sometimes the ":1"
is implicit making TV=1.33 and HDTV=1.78. So, in a system with NTSC, PAL
or SECAM inputs and a HDTV type of display, a scaler can take the standard
NTSC video signal and convert it to a 16x9 HDTV output at various
resolutions (e.g. 480p, 720p, and 1080p) as required to fit the HDTV
display area exactly.
Scaling is often referred to as "scaling down" or "scaling up." An example
of "scaling down" is when a 640x480 resolution TV image is scaled
for display as a smaller picture on the same screen, so that multiple
pictures can be shown at the same time (e.g. as a picture-in-picture or
"PIP"). Scaling the original image down to a resolution of 320x240
(or 1/4 of the original size) allows four input TV resolution pictures to
be shown on the same output TV screen at the same time. An example of
"scaling up" is when a lower resolution image (e.g. 800x600=480,000
pixels) is scaled for display on a higher resolution
(1024x768=786,432 pixels) device. Note that the number of pixels is
the product of the two resolution numbers (i.e. number of
pixels=horizontal resolution x vertical resolution). Thus, when
scaling up, pixels must be created by some method. There are many
different methods for image scaling, and some produce better results than
others.
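One of the simplest ways to create the additional pixels is nearest-neighbor sampling, sketched below. This method is chosen here only because it is short; the text does not say which scaling method is used, and the name scale_nearest is an assumption.

```python
def scale_nearest(image, out_w, out_h):
    """Resize a 2-D list of pixels by nearest-neighbor sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


# Pixel counts from the scaling-up example in the text.
src_pixels = 800 * 600            # 480,000
dst_pixels = 1024 * 768           # 786,432
created = dst_pixels - src_pixels # pixels the scaler must synthesize

small = [[1, 2],
         [3, 4]]
big = scale_nearest(small, 4, 4)  # scaling up: each pixel becomes a 2x2 block
for row in big:
    print(row)
```

Nearest-neighbor sampling only repeats existing pixels, so it tends to look blocky; interpolating methods generally give smoother results at the cost of more computation.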
A scan converter is a device that changes the scan rate of a source video
signal to fit the needs of a display device. For instance, a "video
converter" or "TV converter" converts computer-video to NTSC (TV), or NTSC
to computer-video. Although the concept seems simple, scan converters use
complex technology to achieve signal conversion because computer signals
and television signals differ significantly. As a result, a video signal
that has a particular horizontal and vertical frequency refresh rate or
resolution must be converted to another resolution or horizontal and
vertical frequency refresh rate. For instance, it requires a good deal of
signal processing to scan convert or "scale" a 15.75 kHz NTSC standard TV
video input (e.g. 640x480) for output as 1024x768 lines of
resolution for a computer monitor or large screen projector because the
input resolution must be enhanced or added to in order to provide the
increased capability or output resolution of the monitor or projector.
Because enhancing or adding pixels to the output involves reading out more
frames of video than what is being read in, many scan converters use a
frame buffer or frame memory to store each incoming input frame. Once
stored, the incoming frame can be read out repeatedly to add more frames
and/or pixels.
Similarly, a scan doubler (also called "line doubler") is a device used to
change composite interlaced video to non-interlaced component video,
thereby increasing brightness and picture quality. Scan doubling (also
called "line doubling") is the process of making the scan lines less
visible by doubling the number of lines and filling in the blank spaces. For
example, a scan doubler can be used to convert an interlaced TV signal to
a non-interlaced computer video signal. A line doubler or quadrupler is
typically very useful for displaying images on TV video or TFT flat panel
screens.
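A minimal sketch of line doubling follows, assuming a field is a list of scan lines and each scan line is a list of integer pixel values. The two fill strategies shown (repeating a line, or averaging it with the next line) are common illustrative choices and are not taken from the specification; the name line_double is hypothetical.

```python
def line_double(field, interpolate=False):
    """Double the number of scan lines in a field.

    With interpolate=False each line is simply repeated. With
    interpolate=True the inserted line is the average of the line above
    and the line below (the last line is repeated, having no line below).
    """
    out = []
    for i, line in enumerate(field):
        out.append(list(line))
        if interpolate and i + 1 < len(field):
            below = field[i + 1]
            out.append([(a + b) // 2 for a, b in zip(line, below)])
        else:
            out.append(list(line))
    return out


field = [[10, 20],
         [30, 40]]
print(line_double(field))
print(line_double(field, interpolate=True))
```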
Because of the problems existing in current conversion systems, there is a
need for a system that enhances or improves the quality of video images by
correcting the effects caused by converting the video signal from one type
to another. For instance, current systems lack an effective way to
eliminate field motion from interlaced video material during the
conversion to progressive video.
SUMMARY OF THE INVENTION
The invention defines a method and apparatus for detecting and correcting
motion artifacts in an interlaced video signal converted for progressive
video display. An embodiment of the invention provides a method and
apparatus for enhancing or improving the quality of video images by
correcting the effects caused by converting the video signal from one type
to another. For instance, an embodiment of the invention entails
determining whether interlaced video material originated from a film
source, thereby having been converted to video using a process known as
3-2 pulldown, and then correcting the interlaced video material to
counteract the effect of the 3-2 pulldown. If the video material is
concluded to originate from video because of inadequate confirmation of
the 3-2 pulldown phenomenon, a check is made for the presence of "pixel
motion" so that other necessary corrections may be applied. After
appropriate corrections are applied, the resulting de-interlaced video
material may be additionally processed using processes such as video
scaling to generate a desired output resolution.
A video field is compared to the field prior to the previous field to
generate a field error used in determining either the origination of the
video material from a film source (i.e. 3-2 pulldown process) or the
existence of "pixel motion". Field errors are generated for five
consecutive fields, and a local minimum repeating every five fields
indicates that the video material originated from a film source using the
3-2 pulldown process.
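The field-error test can be sketched as follows. The error measure used here (sum of absolute pixel differences over the whole field) and the rule for recognizing the repeating minimum are assumptions made for illustration; the names field_error, field_errors, has_pulldown_cadence, and flat_field are hypothetical.

```python
def field_error(field, two_back):
    """Sum of absolute pixel differences between a field and the field
    prior to the previous field (assumed error measure)."""
    return sum(abs(a - b)
               for row, old_row in zip(field, two_back)
               for a, b in zip(row, old_row))


def field_errors(fields):
    """Error of every field (from the third onward) against the field two back."""
    return [field_error(fields[n], fields[n - 2]) for n in range(2, len(fields))]


def has_pulldown_cadence(errors, period=5):
    """True if each complete window of `period` errors has a unique minimum
    and that minimum falls at the same offset in every window."""
    windows = [errors[i:i + period]
               for i in range(0, len(errors) - period + 1, period)]
    if len(windows) < 2:
        return False
    offsets = set()
    for w in windows:
        ordered = sorted(w)
        if ordered[0] == ordered[1]:   # no unique minimum in this window
            return False
        offsets.add(w.index(ordered[0]))
    return len(offsets) == 1


def flat_field(value, width=2, height=2):
    """Tiny synthetic field in which every pixel has the same value."""
    return [[value] * width for _ in range(height)]


# Film-originated material: each field is a copy of its film frame (3-2 cadence).
cadence = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5, 6, 6]
film_errors = field_errors([flat_field(10 * k) for k in cadence])

# Video-originated material: every field differs from all others.
video_errors = field_errors([flat_field(10 * n) for n in range(15)])

print(film_errors, has_pulldown_cadence(film_errors))
print(video_errors, has_pulldown_cadence(video_errors))
```

In the synthetic film sequence the error drops to its minimum once in every five fields, while the video sequence shows no such repeating minimum. Real material contains noise, so a practical detector would compare the minimum against the other errors with some margin rather than require exact values.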
In one embodiment, upon confirmation of 3-2 pulldown, the video material is
subsequently modified to correct for the mixing of two film frames into
one interlaced video frame by assuring that the two fields of the
de-interlaced video frame contain data from the same film frame. Where the
video material did not originate from a film source, but pixel motion is
detected, the pixel motion is smoothed out by an averaging method. The odd
and even fields of the resulting video data are subsequently combined to
form a progressive video material.
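The film-mode correction can be pictured as choosing, for each field, a neighboring field that came from the same film frame. The sketch below assumes the film frame index of every field has already been inferred from the detected cadence; the function name pair_fields and this selection rule are illustrative assumptions, not the procedure given in the specification.

```python
def pair_fields(sources):
    """For each field index, pick an adjacent field from the same film frame.

    `sources[n]` is the film frame index of field n. Adjacent fields always
    have opposite parity, so each pair holds one odd and one even field.
    Returns (field_index, partner_index) tuples; partner is None if neither
    neighbor shares the film frame.
    """
    pairs = []
    for n, src in enumerate(sources):
        if n > 0 and sources[n - 1] == src:
            pairs.append((n, n - 1))          # previous field matches
        elif n + 1 < len(sources) and sources[n + 1] == src:
            pairs.append((n, n + 1))          # next field matches
        else:
            pairs.append((n, None))
    return pairs


# Film frame feeding fields 1o, 1e, 2o, 2e, 3o, 3e, 4o, 4e, 5o, 5e.
sources = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]
pairs = pair_fields(sources)
print(pairs)
```

For field 2e (index 3, from f2) the previous field 2o carries f1, so the sketch pairs it with the following field 3o instead, which avoids the mixed frame.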
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an illustration of the mechanics of the 3-2 pulldown process where
film material is converted to interlaced video material.
FIG. 2 is a flow diagram of a video conversion process according to an
embodiment of the present invention.
FIG. 3 is a flow diagram of the video processing according to an embodiment
of the present invention.
FIG. 4 is an illustration of the concept of using five consecutive fields
to determine whether video material originated from film source according
to an embodiment of the present invention.
FIG. 5 is a block diagram illustrating the apparatus of the video
conversion process according to an embodiment of the present invention.
FIG. 6 is an illustration of the processing that occurs during the film
mode flag generation and operation in the film mode according to an
embodiment of the present invention.
FIG. 7 is an illustration of the selection of the field area used for
determination of field differences in accordance with an embodiment of the
present invention.
FIG. 8 is a timing diagram showing the computation of the field error in
accordance with an embodiment of the present invention.
FIG. 9 is a block diagram of one embodiment of a computer system capable of
providing a suitable execution environment for an embodiment of the
invention.
FIG. 10 is a flow diagram illustrating the processing of 3-2 pulldown video
in accordance with an embodiment of the invention.
DETAILED DESCRIPTION OF THE INVENTION
An embodiment of the invention comprises a method and apparatus for
detecting and correcting motion artifacts in an interlaced video signal
converted for progressive video display. In the following description,
numerous specific details are set forth to provide a more thorough
description of embodiments of the invention. It will be apparent, however,
to one skilled in the art, that the invention may be practiced without
these specific details. In other instances, well known features have not
been described in detail so as not to obscure the invention.
An embodiment of the invention provides a method and apparatus for
enhancing or improving the quality of video images by correcting the
effects caused by converting the video signal from one type to another.
For instance, one embodiment of the invention eliminates field motion from
interlaced video material during conversion to progressive video. An
embodiment of the present invention entails determining whether the
interlaced video material originated from a film source and was therefore
converted to video using a process known as 3-2 pulldown. If the film
source was converted to video using the 3-2 pulldown technique, the
invention corrects the effects of the 3-2 pulldown. If the video material
is not a result of the 3-2 pulldown process, a check is made for the
presence of "pixel motion" so that other corrections may be applied. After
appropriate corrections are applied, the resulting de-interlaced video
material is unchanged in both length and rate. Additional processing, such
as video scaling to a desired output resolution, may subsequently be
performed using the de-interlaced video material.
Because determination of 3-2 pulldown or "field motion" requires comparing
different video fields to determine repeat fields, incoming video signals
are digitized and stored in memory buffers. One way of finding repeat
fields is to compare each field to the field prior to the previous field.
Every other field in interlaced video material is of the same type (i.e.
odd or even), and when two adjacent fields of the same type are identical
(e.g. 1 odd-1 even-1 odd), those fields most likely originated from
the same film frame. Identical adjacent fields of the same type occur
every fifth field in a 3-2 pulldown video.
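The buffering and comparison described above might be organized as in the following sketch, which keeps the two most recent fields in memory and compares each incoming field with the one two positions earlier. The class name RepeatFieldDetector, the flat-list field representation, and the threshold parameter are all assumptions made for this example.

```python
from collections import deque


class RepeatFieldDetector:
    """Flags fields that repeat the field prior to the previous field."""

    def __init__(self, threshold=0):
        self.buffer = deque(maxlen=2)   # holds fields n-2 and n-1
        self.threshold = threshold      # largest error still counted as a repeat

    def push(self, field):
        """Store `field` and return True if it repeats the field two back."""
        repeat = False
        if len(self.buffer) == 2:
            two_back = self.buffer[0]
            error = sum(abs(a - b) for a, b in zip(field, two_back))
            repeat = error <= self.threshold
        self.buffer.append(field)
        return repeat


# Synthetic 3-2 pulldown stream: every pixel of a field equals 10 * film frame.
cadence = [1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 5, 5]
detector = RepeatFieldDetector()
flags = [detector.push([10 * k] * 4) for k in cadence]
repeat_positions = [n for n, flag in enumerate(flags) if flag]
print(repeat_positions)
```

With this synthetic input the repeat positions are five fields apart. With digitized analog video the repeated fields would not match exactly, so the threshold would need to be greater than zero.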
In one embodiment, corrections to the video material are applied in real
time while the resulting progressive video is actively displayed.
Therefore, a reliable algorithm to detect, confirm, and correct any video
anomaly is employed during the display process. For example, one or more
embodiments use the error in five successive fields to detect the presence
of 3-2 pulldown. After detection, the 3-2 pulldown must be confirmed for
at least one additional processing cycle. Thus, since the repeat film
frame in a 3-2 pulldown occurs every five video fields, the fifth field
following the detection of 3-2 pulldown should show a repeat field to
declare confirmation. Different confirmation techniques may be employed in
other embodiments such as: two out of three detections, for example, or
even three out of three. The invention also contemplates the use of other
confirmation combinations so long as such combinations reliably confirm the
presence of 3-2 pulldown.
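The confirmation rules mentioned above (detection followed by one confirming cycle, two out of three, three out of three) can be expressed with one small function. The function name confirm_pulldown and its parameters are hypothetical, and the input is assumed to be a list of per-field flags marking where a repeat field was found.

```python
def confirm_pulldown(repeat_flags, period=5, cycles=2, required=2):
    """Return True if 3-2 pulldown is confirmed.

    Starting from each detected repeat field, the flags at that position and
    at the following multiples of `period` are examined (`cycles` positions
    in total). Pulldown is confirmed when at least `required` are set.
    """
    for start, flag in enumerate(repeat_flags):
        if not flag:
            continue
        checks = range(start, start + cycles * period, period)
        if checks[-1] >= len(repeat_flags):
            break                      # not enough fields yet to decide
        hits = sum(1 for i in checks if repeat_flags[i])
        if hits >= required:
            return True
    return False


def flags_with_repeats(length, positions):
    """Build a flag list of `length` fields with repeats at `positions`."""
    return [n in positions for n in range(length)]


steady = flags_with_repeats(13, {2, 7, 12})   # repeat every fifth field
one_off = flags_with_repeats(13, {2})         # single, unconfirmed detection
gappy = flags_with_repeats(13, {2, 12})       # middle repeat missed

print(confirm_pulldown(steady))                          # detect + confirm
print(confirm_pulldown(one_off))                         # never confirmed
print(confirm_pulldown(gappy, cycles=3, required=2))     # two out of three
print(confirm_pulldown(gappy, cycles=3, required=3))     # three out of three
```

A stricter rule (more required hits) lowers the chance of entering film mode by mistake, at the cost of a longer delay before correction begins.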
Once 3-2 pulldown is detected and confirmed, correction to the video
material is performed in real time. Confirmation o