Title: Adaptive method of encoding and decoding a series of pictures by transformation, and devices for implementing this method
Abstract: A method and a device for coding and decoding a sequence of images or pictures is disclosed. One exemplary embodiment disclosed codes by dividing each picture into blocks of picture elements. Each element of a block being represented by a digital value. Two types of coding are utilized in order to reduce the amount of data; inter-coding, which takes into account a corresponding block in a previous picture and intra coding, which is independent from a previous picture. Blocks being thus coded so that a further reduction of data is obtained by transmitting high spatial frequencies with less weight than low spatial frequencies. This is accomplished with the use of weighting coefficients. Weighting coefficients are variable as a function of the quantity of information to be transmitted.
Patent Number: 7,020,204 Issued on 03/28/2006 to Auvray,   et al.
| Inventors:
|
Auvray; Eric (Orvault, FR);
Perron; Claude (Rennes, FR);
Ferre; Alain (Cesson Sevigne, FR)
|
| Assignee:
|
Thomson Licensing (Boulogne-Billancourt, FR)
|
| Appl. No.:
|
071352 |
| Filed:
|
February 8, 2002 |
Foreign Application Priority Data
| Current U.S. Class: |
375/240.25; 375/240.18; 375/240.24 |
| Current Intern'l Class: |
H04N 7/12 (20060101) |
| Field of Search: |
375/24012,240.13,240.24,240.25
348/400.1
382/236,238,248,250
386/109,111
|
References Cited [Referenced By]
U.S. Patent Documents
| 3761613 | Sep., 1973 | Limb.
| |
| 4196448 | Apr., 1980 | Whitehouse et al.
| |
| 4217609 | Aug., 1980 | Hatori et al.
| |
| 4245248 | Jan., 1981 | Netravali et al.
| |
| 4302775 | Nov., 1981 | Widergren et al.
| |
| 4394774 | Jul., 1983 | Widergren et al.
| |
| 4447886 | May., 1984 | Meeker.
| |
| 4541012 | Sep., 1985 | Tescher.
| |
| 4581638 | Apr., 1986 | Chiariglione et al.
| |
| 4583114 | Apr., 1986 | Catros.
| |
| 4672427 | Jun., 1987 | Rzeszewski.
| |
| 4672441 | Jun., 1987 | Hoelzlwimmer et al.
| |
| 4707738 | Nov., 1987 | Ferre et al.
| |
| 4734767 | Mar., 1988 | Kaneko et al.
| |
| 4901075 | Feb., 1990 | Vogel.
| |
| 4918523 | Apr., 1990 | Simon et al.
| |
| 4969055 | Nov., 1990 | Oberjatzas et al.
| |
| Foreign Patent Documents |
| 0 084 270 | Jul., 1983 | EP.
| |
| 0 189 703 | Aug., 1986 | EP.
| |
| 2 589 020 | Apr., 1987 | FR.
| |
| 60-194875 | Oct., 1985 | JP.
| |
| 62-272632 | Nov., 1987 | JP.
| |
Other References
Gakkaishi, Terebijon, International Standardization of Colour Still Picture Coding,
edited by Hiroshi Yasuda, The Journal of the Institute of the Television Engineers
of Japan, vol. 41, No. 9 (Ser. No. 469), Sep. 20, 1987 pp. 815-820, translation attached.
Kaneko, Masahide et al., "Selection of Transformed Coefficients to be Quantized
in Hybrid Coding Method", edited by Masahide Kaneko et al., National Convention
Record (1987), The Institute of Electronics, Information and Communication Engineers
(Part 5), p. 33, translation attached.
H. Bacchi et al., "Real-Time Orthogonal Transformation Of Colour-Television Pictures",
Philips Technical Review, vol. 38, No. 4/5. 1978/1979, pp. 119-129.
A. G. Tescher, "Adaptive Transform Coding Of Color Images At Low Rates", National
Telecommunications Conference, Houston, Nov. 30-Dec. 4, 1980, vol. 2, pp. 36.3.1-36.3.4,
IEEE, New York.
|
Primary Examiner: Le; Vu
Attorney, Agent or Firm: Tripoli; Joseph S., Laks; Joseph J., Liao; Frank Y.
Parent Case Text
CROSS REFERENCE TO RELATED APPLICATIONS
This application is a divisional of application Ser. No. 09/978,376, filed Oct.
17, 2001, now U.S. Pat. No. 6,563,875, which is a continuation of application Ser.
No. 07/408,515, filed Nov. 30, 1995 now abandoned claiming the benefit and priority
of PCT International Application No. PCT/FR88/00649, filed Dec. 30, 1988, and French
Application No. 87 18371, filed Dec. 30, 1987.
Claims
The invention claimed is:
1. A method for decoding a sequence of pictures coded in such a way that a picture
is divided into blocks of picture elements, each block being represented by corresponding
luminance and chrominance blocks and that an inter coding takes into account a
previous picture and an intra coding is independent from a previous picture, and
coding using weighting coefficients, high spatial frequencies being less weighted
than low spatial frequencies, and using the same weighting coefficients for coding
of the corresponding luminance and the chrominance blocks; wherein same coding
type (inter or intra) is specified at a block level and the same coding type is
applied to the corresponding luminance and chrominance blocks; and a picture separator
word is inserted between encoded data corresponding to two consecutive pictures,
each picture separator word comprises a pattern which cannot be imitated by licit
concatenations of encoded data.
Description
The invention relates to an adaptive method of encoding and decoding a series
of pictures by transformation, and devices for implementing this method. The object
of such a method is to reduce the quantity of information to be transmitted, or
to be stored, when pictures have been digitized. It is applicable, for example,
to digital video transmission systems or to digital video recorders.
It is known to encode a digitized picture by using a two-dimensional transformation
of the cosine or Fourier, or Hadamard, or Haar, or Karhunen-Loeve type. Such an
encoding consists in: dividing each picture into blocks of picture elements, each
picture element being represented by a digital value which is the value of its
brightness or of a colour difference: applying the transformation to each block
in order to obtain a matrix of values called the transformation coefficients of
the block; and in transmitting these transformation coefficients in an encoded
form, for example using a Huffmann code. The decoding then consists in: decoding
the Huffmann code words in order to obtain the transformation coefficients; then
in restoring the digital values representing each picture element by applying,
to the transformation coefficients corresponding to a block of picture elements,
the two-dimensional transformation which is the inverse of that used for the encoding.
The transformations used in practice are transformations for which there exists
fast algorithms, for example the cosine transformation.
The French Patent Application 2,575,351 describes an adaptive method of encoding
and decoding consisting in:
dividing each picture into blocks of picture elements;
applying the cosine transformation to each block, the latter being represented
by a block of brightness values in order to obtain a block of transformation coefficient;
determining, for each block, if it represents a scene with much movement
or little movement;
transmitting the value of the transformation coefficients of the block
if the latter represents a scene with much movement, or transmitting the differences
in the value of these coefficients with respect to the coefficients of the similar
block in the previous picture, if the block represents a scene with little movement;
transmitting an information indicating the type of encoding used for
each block, these two types of encoding being called respectively intra-picture
encoding and inter-picture encoding. The coefficients or differences of coefficients
are transmitted in the form of Huffmann code words.
According to this known method, the decoding consists, before applying
the inverse transformation, in determining a value of the transformation coefficients
of each block representing a scene with little movement, by adding the difference
in value of each of its coefficients respectively to the value of the coefficients
of the similar block in the previous picture.
According to this known method, the encoding furthermore consists in applying
a weighting to the values of the coefficients or to the values of the differences
of coefficients, with a greater weight for the coefficients or the differences
of coefficients corresponding to the low spatial frequencies of the picture, with
respect to the coefficients or to the differences of the coefficients corresponding
to the high spatial frequencies of the picture; and in quantifying according to
a linear scale the weighted coefficients and differences of coefficients. The quantification
step is variable according to the quantity of information to be transmitted. This
is equivalent to multiplying all of the coefficients or all the differences of
transformation coefficients of a block by a same coefficient called the quantification
coefficient which is variable according to the quantity of information to be transmitted
for the blocks of picture elements encoded before the block concerned; and in retaining
only the whole part of the result of the multiplication.
The information to be transmitted is stored in a buffer memory enabling a transmission
at a constant rate. A regulating device supplies a value of the quantification
coefficient which continuously diminishes while the buffer memory is filling and
which continuously increases while the buffer memory is emptying.
Naturally, the decoding furthermore consists in multiplying each transmitted
coefficient value or each transmitted difference of coefficients value, by a coefficient
equal to the inverse of the weighting coefficient used for the encoding; and then
in multiplying it by a coefficient equal to the inverse of the quantification coefficient
used for the encoding.
When a series of pictures represents a scene containing much movement, the quantity
of information to be transmitted is high, and consequently the quantification coefficient
is small in order to reduce the amplitude of the values of the transformation coefficients
or differences of transformation coefficients to be transmitted. Furthermore, the
weighting coefficients give greater weight to the transformation coefficients corresponding
to the low spatial frequencies of the picture in order to transmit the essential
information of the picture while sacrificing less essential information which corresponds
to the high spatial frequencies of the picture.
When the series of pictures represents a scene with little movement or a static
scene, the encoding of each block is of the inter-picture type in order to exploit
the correlation existing between these successive pictures. From one picture to
the next, the values of the differences of transformation coefficients of similar
blocks have a decreasing amplitude and the quantity of information to be transmitted
tends to reduce. The regulation then reacts by increasing the quantification coefficient.
On the other hand, the information remaining to be transmitted no longer relates
to the low spatial frequencies of the picture as it has been favoured by the weighting
and has therefore been transmitted. The information remaining to be transmitted
relates only to the high spatial frequencies of the picture and the latter are
then transmitted with a large amount of information. After a time interval corresponding
to several pictures, the totality of the information representing a static scene
is then transmitted and enables the reconstruction of the scene with very good fidelity.
For the encoding and decoding of colour television pictures, the previously mentioned
document suggests processing in parallel three series of digital values corresponding
to a brightness signal and to two colour difference signals respectively.
This known method has two disadvantages: the fact of the parallel processing
of these three series of digital values leads to the use of three buffer memories
which must restore the encoded information with data rates having constant ratios
because the transmission channel has a constant data rate. Now, the information
data rates corresponding to a brightness signal and to two colour difference signals
have extremely variable ratios because the saturation of the colours is very variable
and can even be zero in the case of pictures containing only whites, greys and
blacks. The fact of imposing a constant ratio between these three information data
rates leads in practice to uselessly increasing the quantity of information transmitted,
or in sacrificing a portion of the information corresponding to colour differences,
which is harmful to the fidelity of the reproduction.
Another disadvantage results from the regulation used in this method. When,
in a same picture, there is a succession of blocks encoded by an inter-picture
encoding, the quantity of information to be transmitted being small, the regulation
reacts by increasing the quantification coefficient and tends to maintain a filling
of the buffer memory. If an isolated block, or several blocks are to then be encoded
by an intra-picture encoding, because they correspond to a limited area which is
in motion, it is suddenly necessary to transmit large amounts of information. The
buffer memory being maintained practically full, the regulation can only react
by sacrificing a large portion of the information to be transmitted, i.e. by suddenly
reducing the quantification coefficient when the buffer memory approaches saturation.
In such a case, the blocks of picture elements encoded by the inter-picture encoding
are restored with excellent fidelity while the adjacent blocks, encoded by an intra-picture
encoding are restored with mediocre fidelity. The difference in quality is then
very noticeable because these two types of blocks are adjacent in the same picture.
The purpose of the invention is to overcome these two disadvantages of the known
method. The object of the invention is an adaptive method of encoding consisting
in particular in storing in a same buffer memory the information to be transmitted
corresponding to the values of brightness and to the values of the two colour difference
signals, and consisting in using weighting coefficients and identical quantification
coefficients, except for the application of a constant, for the transformation
coefficients or the differences of transformation coefficients corresponding to
these three types of signals.
According to another feature, the method according to the invention consists
in using weighting coefficients which, in addition to giving a greater weight to
information corresponding to the low spatial frequencies of the picture, are also
variable according to the quantity of information to be transmitted, in order to
further reduce the weight given to the information corresponding to the high spatial
frequencies of the picture when the filling of the buffer memory increases and
approaches the maximum.
According to another feature of the method according to the invention,
the quantification coefficient is variable as a function of the filling rate of
the buffer memory, but with a discontinuity corresponding to a fixed filling threshold,
in order to be constant below this filling threshold and in order to increase when
the filling rises above this threshold.
According to the invention, an adaptive method of encoding and decoding
of a series of pictures by transformation, the encoding consisting in:
dividing each picture into blocks of picture elements, each block being
represented by a block of brightness values, a block of blue colour difference
values and a block of red colour difference values;
applying a two-dimensional transformation to each block of values in order
to obtain a block of transformation coefficients of the block of values concerned;
transmitting, for each block of values, either the value of transformation
coefficients of the block, or the difference in value of these transformation coefficients,
with respect to the value of the transformation coefficients of a similar block
in the picture preceding the picture being encoded, in order to minimize the quantity
of information to be transmitted for the block in question, the encoding being
called intra-picture or inter-picture respectively;
furthermore consisting in multiplying the transformation coefficients
and the differences of transformation coefficients, before they are transmitted,
by a coefficient called the weighting coefficient, favouring the low spatial frequencies
of the pictures; and also in multiplying them by a coefficient called the quantification
coefficient which is variable as a function of the quantity of information to be transmitted;
the decoding consisting for each block in:
multiplying each transmitted value of transformation coefficient or each
transmitted value of transformation coefficient difference, by a coefficient equal
to the inverse of the weighting coefficient used for the encoding and by a coefficient
equal to the inverse of the quantification coefficient used for the encoding;
adding, to the value of each difference of transformation coefficients, the
value of a transformation coefficient, similar to the coefficient in question in
a similar block to the block in question and belonging to the picture preceding
the picture being decoded;
applying to each transformation coefficient a transformation which is the
inverse of the transformation applied for the encoding, in order to obtain a block
of values representing a portion of the decoded picture;
is characterized in that it furthermore consists in regulating the data rate
of
the transmitted information, in storing in a same buffer memory, the information
to be transmitted corresponding to the 3 types of blocks of values, and in multiplying
the transformation coefficients and the differences of transformation coefficients
of the 3 types of blocks by a same variable weighting coefficient and by a same
variable quantification coefficient, except for the application of a constant multiplication factor.
The invention will be better understood and other characteristics will appear
with the help of the following description and of the accompanying figures in which:
FIG. 1 shows the order of scanning the transformation coefficients or the differences
of transformation coefficients of a block of Picture elements, in one example of
implementation of the method according to the invention;
FIG. 2 shows a graph illustrating a weighting performed in this example of implementation;
FIG. 3 shows a graph illustrating a quantification performed in this example
of implementation;
FIGS. 4 to 6 illustrate the regulation of the data rate of the transmitted
information in this example of implementation;
FIGS. 7 and 8 illustrate the implementation of a variant of the method according
to the invention;
FIG. 9 shows the synchronization signals of an encoder and a decoder, in an
example of implementation of the method according to the invention;
FIGS. 10 and 11 show the block diagram of an embodiment of an encoding device
for the implementation of the method according to the invention;
FIGS. 12 and 13 show the block diagram of an embodiment of a decoding device
for the implementation of the method according to the invention.
In one example of implementation, the series of pictures is constituted by a
series
of colour television frames, sampled and digitized. Each picture element is represented
by a brightness value, a red colour difference value and a blue colour difference
value, each of these values having eight bits. The brightness is sampled at a frequency
of 10.125 MHz while each of the colour difference signals is sampled at a frequency
of 5.0625 MHz. The data rate of the encoded information is constant and in the
order to 10 Mbits per second.
Each television picture to be encoded is constituted by two interlaced frames
analysed in a conventional way by a television camera. Each frame is encoded separately.
In this example of implementation the transformation used is the cosine transformation.
It is applied to blocks of 16×16 brightness values, blocks of 16×8 red
colour difference values, and blocks of 16×8 blue colour difference values.
Any other known two-dimensional transformation can be used for the implementation
of the method according to the invention. The transformation is applied in parallel
to each of these three types of block. For each block of values it supplies a matrix
of values called transformation coefficients of the block of values concerned.
The block of transformation coefficients has dimensions identical to those of the
block to be transformed. The transformation coefficients have real values.
The transformation coefficients of the brightness values f(i,j) are computed
according to the formula:
##EQU1##
where c(u)=1/√{square root over (2)} if u=0
The transformation coefficients of the colour difference values f′(i,j)
are computed according to the formula:
##EQU2##
i and j are respectively the line index and the column index in the block of
values
to be encoded; u and v are respectively the line and column indices of the transformation
coefficients in the block of these coefficient. The transformation coefficient
situated in the first column and on the first line is equal to twice the average
value of the coefficients of a block of transformation coefficients. It has a value
which is always positive. This value must be encoded with the greatest possible
accuracy since the slightest error in this transformation coefficient results in
a very visible difference between a block of picture elements and the adjacent blocks.
The other coefficients of a transformation block correspond to spatial frequencies
of the picture which are increasing as u and v increase. For the highest values
of u and v, the transformation coefficients are generally zero. In the series,
the coefficients of a block of transformation coefficients are considered according
to a scanning order which is graphically represented in FIG. 1 and which corresponds
to increasing values of the sum u
2+v
2. The scanning path
of the transformation coefficients is chosen in such a way as to optimize the compression
ratio for the type of picture which is to be encoded, according to their statistical characteristics.
The implementation of the method according to the invention then consists in
performing, in parallel, an encoding called the inter-picture encoding and an encoding
called the intra-picture encoding for each type of value to be encoded: brightness
value, blue colour difference value and red colour difference value. The inter-picture
encoding consists in computing the difference in value between the transformation
coefficients of a block of picture elements in question, respectively with regard
to the transformation coefficients of a similar block to the block in question
in the picture preceding the picture being encoded. The intra-picture encoding
consists in directly using the value of the transformation coefficients of the
block. In both cases, the method of encoding then consists in applying a weighting,
a quantification and then a Huffmann encoding.
In general, the intra-picture encoding of a block representing a portion of a
picture in motion requires a larger amount of information than the inter-picture
encoding. Conversely, the inter-picture encoding of a block representing a static
portion of picture generally requires less information than the intra-picture encoding.
The choice of the type of encoding is common for all three types of signal to be
encoded. In this example of implementation, the choice of the type of encoding
consists in precisely determining the quantities of information respectively necessary
in both cases for a same block of digital values to be encoded. Each quantity of
information is computed by counting the number of encoded data bits supplied by
the following series of operations: weighting, quantification, Huffmann encoding.
The method then consists in transmitting the data supplied by the encoding requiring
the smallest amount of information.
The weighting enables the exploitation of the fact that a suppression of information
encoding certain transformation coefficients of a block of picture elements does
not give rise to much degradation in the decoded picture. The coefficients corresponding
to the low spatial frequencies of the picture are more sensitive to suppressions
of information than the coefficients corresponding to the high spatial frequencies
of the picture. The weighting is such that the coefficients corresponding to the
low spatial frequencies are favoured. It consists in multiplying the value of the
transformation coefficients or of the differences of the transformation coefficients
of a block by a weighting coefficient which is given, for the brightness, by the
following formula:
##EQU3##
where u and v are respectively the index of the column and of the line of the
coefficient or of the difference of coefficients to which the weighting applies;
where R is a constant which depends on the size of the block and on the sampling
frequency of the picture, its value being 1.4 for a sampling frequency of 10.125
MHz and for a block of size 16×16; where Nor is a constant parameter but depending
on R which is given by the following formula:
##EQU4##
Nor=0.42 for a sampling frequency of 10.125 MHz;
and where Pon is a variable parameter which defines the severity of the weighting.
Its value depends on the filling of the buffer memory storing the encoded information
to be transmitted, corresponding to the three types of signal representing the
picture elements. This information is the information relating to the blocks preceding
the block of picture elements being encoded. The number of bits in question is
that obtained after the Huffmann encoding of the non-zero values, the encoding
by sequences of zero values, and after insertion of data separating words. The
severity of the weighting is an increasing function of the filling of the buffer
memory, in order to act against this filling.
In this example of implementation, the capacity of the buffer memory is 64 kilobits.
The value of the parameter Pon, and the weighting ratio obtained, between the high
and the low spatial frequencies of the picture are given in the following table:
| |
|
| |
|
|
Value obtained |
| |
|
|
for the weighting |
| |
Filling of buffer |
Value of the Pon |
ratio high/low |
| |
memory |
parameter |
frequencies |
| |
|
| |
| |
64 to 48 Kb |
18 |
5 |
| |
48 to 40 Kb |
18.5 |
4.5 |
| |
40 to 32 Kb |
19 |
4 |
| |
32 to 24 Kb |
20 |
3.5 |
| |
24 to 16 Kb |
22 |
3 |
| |
16 to 8 Kb |
24 |
2.5 |
| |
8 to 0 Kb |
27 |
2 |
| |
|
The value of the transformation coefficients or of the differences of the transformation
coefficients for the colour difference signals, is weighted by a coefficient given
by the formula:
##EQU5##
where R′ is a constant which depends on the size of the block and on
the sampling frequency for the colour difference signal and which is equal to 0.7
for blocks of 16×8 and for a sampling frequency of 5.0625 MHz; and where Nor′
is a constant which is given by the following formula:
##EQU6##
Nor′=0.59 for a sampling frequency of 5.0625 MHz.
The weighting coefficient P
ch(u, v) is also a function of the filling
of the buffer memory, by means of the variable Pon, in order to participate in
the process of regulation of the data rate of the transmitted encoded information.
The weighting coefficient is the same for the data encoded by the inter-picture
encoding and for that encoded by the intra-picture encoding.
The method of regulation furthermore consists in multiplying the value of the
transformation coefficient or of the differences of transformation coefficients
of a block by a quantification coefficient which is a function of the filling of
the buffer memory, the latter containing the encoded data corresponding to the
blocks preceding the block being encoded.
The quantification operation is performed in parallel on the transformation coefficients
obtained by the intra-picture encoding and on the differences of transformation
coefficients obtained by the inter-picture encoding, after the weighting operation.
For a given block of picture elements, all of the transformation coefficients and
all of the differences of transformation coefficients corresponding to the brightness
are multiplied by the same quantification coefficient value. All of the transformation
coefficients and all of the differences of transformation coefficients corresponding
to the two colour difference signals are multiplied by a weighting coefficient
which has the same value as that corresponding to the brightness, except for the
application of a constant multiplication factor. This constant is equal to 1.41
in order to compensate for a constant multiplication factor introduced during the
computation of the cosine transforms and which is slightly different for blocks
of different sizes, as is the case for the brightness on the one hand and for the
colour differences on the other hand.
The quantification coefficient is constant for a filling E
b of the
buffer memory, less than a threshold value; it is exponentially decreasing when
the filling E
b is above this threshold value. In this example, in which
the capacity of the buffer memory is equal to 64000 bits, the filling threshold
value is taken as equal to 56000 bits. For the brightness, the quantification coefficient
is given by the following formula:
##EQU7##
For the colour difference signals, the quantification coefficient is given by
the following formula:
##EQU8##
where the constants Nor and Nor′ have the previously defined values.
The values of the transformation coefficients or of the differences of transformation
coefficients are truncated after the weighting and the quantification in order
to round them to the closest whole value.
FIG. 2 shows the graph of the ratio between the weighting coefficient applied
to the high spatial frequencies and the weighting coefficient applied to the low
spatial frequencies of the picture, for the brightness, as a function of the fitting
E
b of the buffer memory. It increases in steps as the filling varies
from 0 to 64 Kbits.
FIG. 3 shows the graph of the quantification coefficient N
lum, corresponding
to the brightness, as a function of the fitting E
b. It can clearly be
seen on this graph that the quantification coefficient is constant for most of
the filling values and that it decreases very rapidly when the filling is close
to its maximum. The relationship between the quantification coefficient and the
filling, in the method according to the invention, is therefore clearly different
from those used conventionally and which vary continuously. A conventional quantification
coefficient does not include such a horizontal level but decreases regularly as
a function of the filling.
FIGS. 4,
5 and
6 illustrate the regulation process, in the case
in which three successive pictures represent a static scene. Each of these figures
shows a graph in which the quantity of information actually transmitted is plotted
vertically and in which the rank of the transformation coefficients or of the differences
of transformation coefficients of each of these pictures is plotted horizontally,
this rank being determined on the scanning path shown in FIG. 1. The quantities
of information concerned are those obtained after the weighting, the quantification
and the Huffmann encoding.
However, the quantity of information corresponding to the first transformation
coefficient, i.e. having 0 as a line index and as a column index, is not shown
in FIGS. 4 to 6 as it is not subject to the data rate regulation. The first coefficient
is not weighted and is not standardized in order to avoid showing visible discontinuities
between the blocks on the restored picture. FIGS. 4 to 6 therefore show the quantity
of information corresponding only to the other transformation coefficients or differences
of transformation coefficients.
The first picture of these three successive pictures is assumed to be different
from the pictures having possibly preceded it. The transformation coefficients
of this first picture are therefore all encoded by an intra-picture encoding necessitating
a large amount of information, which is particularly distributed over the transformation
coefficients having a low rank. In addition, the weighting has the effect of also
favouring the coefficients of low rank. And, finally, certain coefficients of high
rank are rounded to zero during the operation consisting in rounding to the closest
whole value. Everything happens as if there were a threshold at +0.5 and -0.5.
For all of these reasons, in FIG. 4, the coefficients of high rank are not transmitted
by any quantity of information. In particular, many coefficients of high rank are
rounded to the zero value because they are less than 0.5 in absolute value. The
number of coefficients or differences of coefficients rounded to zero increases
when there is a reduction in the quantification coefficient resulting from a large
amount of filling of the buffer memory. The threshold effect combined with the
quantification therefore tends also to suppress information corresponding to high
spatial frequencies.
FIG. 5 corresponds to a second picture which immediately follows the first one
and which is identical to it. The transformation coefficients of the second picture
are therefore theoretically strictly identical to those of the first picture. They
will be encoded by an inter-picture encoding in order to exploit the correlation
between the first and second pictures. The inter-picture encoding is performed
by computing the difference between the transformation coefficients of the second
picture and the transformation coefficients of the first picture after having submitted
the latter to an encoding and then to a decoding in order to subtract a same value
during the encoding step and during the decoding step.
The weighting and quantification operations suppress information and therefore
cause encoding errors which result in a non-zero difference between the transformation
coefficients before encoding and the transformation coefficients after encoding
followed by decoding. There are therefore non-zero differences between the coefficients
encoded and then decoded for the first picture and the coefficients which will
be encoded for the second picture. These differences are in particular due to the
fact that the weighting coefficient and the quantification coefficient vary from
one picture to another.
FIG. 5 shows the quantity of information corresponding to these differences
of coefficients, these differences having undergone a weighting, a quantification
and a Huffmann encoding. The quantity of information constituted by these differences
is lower than the quantity of information corresponding to the values of the transformation
coefficients of the first picture for several reasons. Firstly, because the differences
of coefficients have low values, since the first picture and the second picture
are identical. This quantity of information corresponds above all to the average
frequencies and to the high frequencies, i.e. to the average ranks and to the high
ranks, because the weighting and quantification performed during the encoding of
the first picture have sacrificed the information corresponding to the average
ranks and to the high ranks. This information will enable the addition of details
in the second restored picture. The encoding of all of the blocks of the second
picture being an inter-picture encoding, using the correlation with the first picture,
the quantity of information to be transmitted diminishes the buffer memory empties
in consequence and the regulation process causes an increase in the quantification
coefficient and then its maintenance at the constant value: 1.
The increase in the quantification coefficient increases the amplitude of the
differences of transformation coefficients and therefore tends to increase the
quantity of information to be transmitted for the coefficients of average and high
ranks, and therefore tends to fill the buffer memory. However, the levelling of
the quantification coefficients slows down this tendency. Furthermore, the information
to be transmitted corresponds in particular to the average and high spatial frequencies
since the information corresponding to the low spatial frequencies has been transmitted
to a large degree during the encoding of the first picture. The weighting acts
progressively against the high frequencies as the buffer memory becomes filled.
Therefore the weighting itself also tends to reduce the quantity of information
transmitted for the second picture. Finally, this quantity if distinctly less than
that transmitted for the first picture.
The encoding of a third picture identical to the two previous ones is also an
inter-picture encoding and only has to transmit information corresponding to high
spatial frequencies which have not been able to be transmitted during the encoding
of the first and second pictures. This information will enable the addition of
fine details to the restored third picture. Thus when there is a series of static
pictures, the restored pictures rapidly achieve a very good fidelity.
FIG. 6 shows the quantity of information to be transmitted for the third picture,
as a function of the rank of the coefficients. It should be noted that this quantity
of information corresponds in particular to the very high spatial frequencies and
that it is generally smaller with respect to the quantity of information to be
transmitted for the encoding of the second picture and for the encoding of the
first picture, because the weighting very much acts against the very high frequencies,
even though the buffer memory is beginning to empty. The severity of the quantification
and the weighting is chosen such that the regulation does not tend to maintain
the filling at a constant level, but tends to reduce the filling during each inter-picture encoding.
In the most general case, each picture comprises static areas and areas in motion.
The blocks located in the areas in motion require an intra-picture encoding which
tends to saturate the buffer memory. The severity of the weighting and the quantification
is chosen such that the memory is not saturated by the information corresponding
to these blocks. The weighting is a function of the filling of the buffer memory
in such a way that it acts to a lesser degree against the coefficients or the differences
of coefficients corresponding to the high spatial frequencies, as the filling diminishes.
But the severity of the weighting remains such that the buffer memory empties when
there is a series of blocks encoded by the inter-picture encoding, in order to
have a capacity available for the encoding of blocks encoded by an intra-picture
encoding which produces a large amount of information to be transmitted.
Thus when one or more successive blocks have to be encoded by an intra-picture
encoding, the buffer memory is not close to saturation and consequently the quantification
coefficient does not have to be suddenly increased in order to avoid an overshooting
of the capacity of the buffer memory when a block to be encoded by the intra-picture
encoding arrives. By avoiding sudden variations in the quantification coefficient,
this regulation process avoids suddenly degrading the quality of restoration of
the picture from one block to another in the same picture. This makes it possible
to avoid a visible contrast in the quality of restoration, between blocks encoded
by inter-picture encoding and blocks encoded by intra-picture encoding, close to
each other in the same picture.
Since the method according to the invention consists in multiplying the transformation
coefficients or differences of transformation coefficients corresponding to the
brightness and to the colour difference signals, by the same weighting coefficient
and by the same quantification coefficient, apart from the application of a constant
multiplication factor, these three types of signals are therefore encoded with
the same quality within a same block, and the information to be transmitted can
be stored in a common buffer memory. This storage in a common buffer memory enables
the transmission of the encoded information with any ratio between the quantities
of information corresponding to the three types of signals. The absence of an imposed
ratio between the quantities of information transmitted for the three types of
signals enables, with equal fidelity, an important gain in the compression rate
of the pictures.
In fact, the quantity of information to be transmitted for the two colour difference
signals is extremely variable depending on the scenes represented by the pictures.
When the pictures have colours which are not very saturated, the quantity of information
to be transmitted for the colour difference signals is low. In this case, the common
regulation enables the transmission of a reduced quantity of information for the
colour difference signals, unlike the conventional method of independent regulation
for the three types of signals, which leads to the use of three independent buffer
memories and imposes a constant ratio between the quantities of information transmitted
for the three types of signals.
The filling E
b of the buffer memory must be known before starting
the encoding of a block in question. It must take into account the encoded information
corresponding to all of the blocks which precede the block in question. It is computed
by adding the quantities of information to be transmitted for all of the blocks
preceding the block in question, and by subtracting from this sum the quantity
of information transmitted, computed by taking the product of the data rate of
the transmission channel and the duration which has elapsed between the start of
the transmission and the end of the transmission of the information encoding the
block preceding the block in question.
After the quantification, each value of a transformation coefficient or difference
of transformation coefficients, with the exception of the first coefficient in
each block, and with the exception of the zero values, is encoded by a Huffmann
code. The transformation coefficients or differences of transformation coefficients
of one block are considered successively in the order of scanning corresponding
to increasing u
2+v
2, according to FIG. 1. The coefficients
or differences of coefficients which are zero are encoded by sequences, the lengths
of the sequences being encoded by a Huffmann code. The order of scanning chosen
is such that the series of coefficients or of differences of coefficients of a
block always ends with a long sequence of zero values. The first coefficient or
the first difference of coefficients of each block, having 0 as a line index and
as a column index, is transmitted without Huffmann encoding.
In order to optimize the reduction of the data rate, the Huffmann encoding is
performed according to 8 different trees;
A1, for encoding the transformation coefficients corresponding to the brightness
signal, which are preceded by a sequence of zeroes;
A2, for encoding the transformation coefficients corresponding to the brightness
signal, which are not preceded by a sequence of zeroes;
A3, for encoding the differences of transformation coefficients corresponding
to the brightness signal, which are preceded by a sequence of zeroes;
A4, for encoding the differences of transformation coefficients corresponding
to the brightness signal, which are not preceded by a sequence of zeroes;
A5, for encoding the transformation coefficients corresponding to any of the
colour difference signals, which are preceded by a sequence of zeroes;
A6, for encoding the transformation coefficients corresponding to any of the
colour difference signals, which are not preceded by a sequence of zeroes;
A7, for the encoding of the differences of transformation coefficients corresponding
to any of the colour difference signals, which are preceded by a sequence of zeroes;
A8, for the encoding of differences of transformation coefficients corresponding
to any of the colour difference signals, which are not preceded by a sequence of zeroes.
However, it is possible to use identical encoding trees A5 and A7; and identical
encoding trees A4, A6 and A8, at the cost of a certain degradation in the compression
rate. It should also be noted that the encoding trees also encode particular events:
packing bits and data separator words.
The fact of using two different trees for encoding the values of coefficients
or of differences of coefficients which are not zero and which are not preceded
by a sequence of zeroes and for encoding the coefficients or differences of coefficients
which are not zero and which are preceded by a sequence of zeroes, results in a
reduction in the quantity of information to be transmitted in the order of 10%
with respect to the known methods in which a single encoding tree is used for these
two separate cases. The reason for this reduction is as follows: a priori, it would
be necessary to have two separate trees, with a prefix to distinguish them from
each other, in order to encode on the one hand a sequence of zeroes and, on the
other hand, to encode a coefficient or a non-zero coefficient difference. But there
are never two consecutive sequences of zeroes, as in this case they would be encoded
as a single sequence. Consequently, it is certain that after a sequence of zeroes
there is a coefficient or a difference of coefficients which is not zero. It is
known information which does not therefore have to be transmitted. The use of the
two trees mentioned above enables this redundancy of information to be exploited
in order to reduce the quantity of information to be transmitted.
In this example of implementation, the Huffmann codes used for encoding the coefficients
have a dynamic range limited to -63, +63. Those used for encoding the differences
of coefficients have a dynamic range limited to -31, +31. In the case of overshoot,
at least one overshoot prefix is added. In order to distinguish the 0 modulo +64
and the 0 modulo -64 values, two distinct code words are added. The multiple values
of +64 and -64 are respectively denoted 0+ and 0- in the following description
and they encoded using several overshoot prefixes. The last sequence of zeroes
in each block is not encoded, the encoded data corresponding to each block being
separated by an inter-block synchronization word. The value of the first coefficient
or of the first difference of coefficients in each block is represented in clear
by nine bits.
The eight Huffmann trees satisfy the following conditions:
the code words all have a length less than 16 bits;
no licit concatenation of code words must make a series of 10 zeroes appear, consequently:
- the code words cannot terminate, unless by exception, in more than 5 zeroes;
- the word "00000" is reserved for precise uses;
- the code words cannot begin with more than 4 zeroes;
only the encoding trees corresponding to the colour difference signals include
synchronization words.
The encoding tree A1 encodes 195 events. The non-zero coefficients give rise
to 129 possible events which are: the values -63, . . . , -1, 1, . . . , +63; an
overshoot prefix relating to the coefficients; a value which is a multiple of +64,
which is referenced 0+; a value which is a multiple of -64, which is referenced 0-.
The sequences of zeroes give rise to 65 possible events: the values of length:
1,. . . , 63: an overshoot prefix relating to the sequences of zeroes; and a zero
value 0
p associated with the sequences of zeroes.
A particular event is constituted by a packing.
The conditions which this tree A1 must satisfy are as follows: the event 0+ must
be encoded by "00000"; the code words must end in at least 3 zeroes; and the overshoot
prefix for the coefficients must end in 1.
The tree A2 encodes 129 events: the values -63, . . . , -1, 1, . . . , 63; an
overshoot prefix relating to the coefficients; a value which is a multiple of +64,
referenced 0+; a value which is a multiple of -64, referenced 0-. This tree must
satisfy the following condition: the shortest code word must have a length of two
bits and be constituted by "00". There is no prohibited event.
The tree A3 encodes 195 events and has the following characteristics:
the non-zero coefficients give rise to 129 events: the values -63, . . . , -1,
1, . . . , 63; an overshoot prefix relating to the coefficients; a value which
is a multiple of +64; and a value of which is a multiple of -64;
the sequences of zeroes give rise to 65 events: the values 1, . . . , 63; an
overshoot prefix relating to the sequences of zeroes; a value 0
p relating
to the sequences of zeroes;
a particular event is constituted by a packing.
This tree A3 must satisfy the following conditions: the value which is a multiple
of +64 is encoded by "00000"; the code words must not end in more than 4 zeroes;
the length of the code words, for the lengths of sequences of zeroes, is greater
than three bits; and the overshoot prefix relating to the coefficients must end
in a 1.
The tree A4 must encode 65 events which are: the values of non-zero coefficients:
-31, . . . , -1, +1, . . . , 31; an overshoot prefix relating to coefficients;
a value which is a multiple of +31; a value which is a multiple of -31. This tree
must satisfy the following condition: the shortest code word must have a length
of one bit and be equal to 0. There is no prohibited event.
The tree A5 encodes 131 events and must have the following characteristics: 65
events for the coefficients, constituted by the values -31, . . . , -1, +1, . .
. , 31; an overshoot prefix relating to coefficients, 0+, 0-; 65 events for the
sequences of zeroes; the values 1, . . . , 63, an overshoot prefix associated with
the sequences of zeroes, and the zero length; a particular event which is constituted
by an intra-block synchronization word. This tree must satisfy the following conditions:
the value 0+ must be encoded by 00000; the sequences of zeroes cannot end in more
than 4 zeroes; the length of the sequences of zeroes must be greater than 3; and
the code of the overshoot prefix relating to the coefficients must end in 1.
A particular event is constituted by an intra-block synchronization word.
The tree A6 encodes 65 events and it is identical to the tree A4.
The tree A7 encodes 131 events and it is identical to the tree A5.
The tree A8 encodes 65 events and it is identical to the tree A5.
An optional refinement of the method according to the invention consists in making
the weighting coefficient and the quantification coefficient functions of a parameter
called the category of the block of elements being encoded. This parameter represents
the difficulty of restoration of this block of picture elements. In fact, experience
shows that the worst restored blocks are characterized by the fact that they contain
at least one relatively uniform dark area extending over at least one block of
picture elements adjacent to the block in question, the boundary between the two
blocks passing through the dark area over a relatively long length. In such a case
the dark area is encoded differently on either side of the boundary, which makes
the division of the picture into blocks visible, particularly because the granular
noise is not restored in the same way and is visible particularly in a dark area.
In one example of implementation, the method consists in classifying the blocks
of picture elements into eight categories numbered from 1 to 8 according to the
increasing difficulty in restoring them without making the boundaries between the
blocks appear. It consists in sub-dividing each block of 16×16 elements into
blocks of 4×4 elements, then in computing the average value of the brightness
in each of the sub-blocks of 4×4 elements. In practice, only the sub-blocks
located at the periphery of a block are considered.
FIG. 7 shows an example of a block, with the twelve sub-blocks for which an
average value of the brightness is computed. The latter are cross-hatched. The
method then consists in computing an average value of brightness in areas of elongated
shape located at the periphery of the blocks of picture elements and covering two
adjacent sub-blocks. These areas partially overlap. In FIG. 7 the sub-blocks are
numbered in the clockwise direction starting from the one located in the upper
left-hand corner. In FIG. 8 the areas are numbered in the clock-wise direction
starting from the one located in the top left corner. For example the area N
o1
covers the sub-block N
o1 and sub-block N
o2. The area N
o2
covers the sub-block N
o2 and the sub-block N
o3. The average
brightness value in an area is equal to the half-sum of the average brightness
values of the two sub-blocks enclosed by this area. This average brightness value
is computed according to the following formula:
L(area
Noi)=½(
L(block
Noi)+
L(block
No(
i+1))) (9)
For i=1 to 12; and where L(block N
oi) and L(block N
oi(+1))
respectively represent the average brightness value in block N
oi and
the average brightness value in block N
oi+1.
The method then consists in determining the area having the smallest average
brightness value. This minimum brightness value is referenced L min and determines
the difficulty of encoding the block in question. The method then consists in classifying
the block in question into a category of difficulty from among eight categories,
by comparing the minimum brightness value with seven threshold values and consists
in determining a weighting coefficient and a quantification coefficient depending
on the category of difficulty determined for