WO2007104524A1 - Method and apparatus for blockwise compression and decompression for digital video stream decoding - Google Patents


Info

Publication number
WO2007104524A1
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
block
pixels
variable length
differences
Application number
PCT/EP2007/002164
Other languages
French (fr)
Inventor
Eike Grimpe
Yin-Chun Blue Lan
Clair Cho
Original Assignee
Micronas Gmbh
Application filed by Micronas Gmbh
Priority to EP07723201A (EP1994762A1)
Publication of WO2007104524A1

Links

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/42: ... characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H04N 19/436: ... using parallelised computational arrangements
    • H04N 19/10: ... using adaptive coding
    • H04N 19/102: ... using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/13: Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H04N 19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H04N 19/134: ... using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/136: Incoming video signal characteristics or properties
    • H04N 19/169: ... using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N 19/17: ... the unit being an image region, e.g. an object
    • H04N 19/176: ... the region being a block, e.g. a macroblock
    • H04N 19/184: ... the unit being bits, e.g. of the compressed video stream
    • H04N 19/50: ... using predictive coding
    • H04N 19/593: ... involving spatial prediction techniques

Definitions

  • Fig. 14 shows an example of a test encoder arrangement 70 for evaluating the m value and the truncation length which are to be applied for coding a given Y block and given U and V blocks.
  • Y_trunc denotes the truncation length applied to differences of the Y block (61 in Fig. 7).
  • UV_trunc denotes the truncation lengths applied to differences of the U and V blocks (62, 63 in Fig. 7).
  • The test encoder arrangement 70 of Fig. 14 comprises four test encoders 71, 72, 73, 74, with each test encoder consisting of two separate encoders for the luminance data 71Y, ..., 74Y and the chrominance data 71UV, ..., 74UV.
  • Each of the luminance data encoders 71Y, ..., 74Y VLC encodes a complete Y block using the Y_trunc and Y_x parameters and counts the overall number of bits.
  • Each of the chrominance data encoders 71UV, ..., 74UV VLC encodes a complete U block and a complete V block using the UV_trunc and UV_x parameters and counts the overall number of bits.
  • Y_x and UV_x of one encoder unit may be identical or may be different
  • Y_trunc and UV_trunc of one encoder unit may be identical or may be different.
  • The numbers of bits provided by the luminance and chrominance data encoder pair of one encoder unit are added by adders 75, 76, 77, 78, and the resulting overall bit count values BitCt0, BitCt1, BitCt2, BitCt3 are provided to a mode selector 79, which, based on the overall bit count values and on the desired number of coded bits, provides a mode select signal MS.
  • Mode select signal MS is provided to an encoder 80 that VLC codes the given Y block and the given UV blocks, taking into account mode select signal MS.
  • The mode select signal MS may include Y_trunc, UV_trunc, Y_x and UV_x.
  • Encoder 80 outputs a compressed data stream that represents the VLC encoded differences A-B obtained for each pixel of blocks 61, 62, 63 by subtracting the prediction value B from the pixel value A.
  • The data stream provided by encoder 80 also includes codes representing pixel data of the upper left pixels Y11, U11, V11 of the blocks. These pixel data are coded directly using a fixed length code, where one or more LSBs may be truncated prior to coding.
  • Fig. 16 gives an overview of the block compression method according to an example of the invention.
  • The method comprises the prediction of pixel values for block pixels using neighboring pixels, forming the differences between pixel values and predicted values, reordering the differences, and VLC coding the reordered differences.
  • Reordering the differences may include mapping the differences to all positive target values.
  • Decompressing a compressed data stream includes VLC decoding, reordering back the differences obtained by decoding, prediction starting with Y11, U11, V11 as explained above, and calculating decompressed pixel values from the differences and the predicted values, wherein prediction for given pixels may require decompressed pixel values of neighboring pixels.
  • The truncated LSBs are added on a pseudo-random basis during the decompression process.
  • a fixed length coding using LSB truncation may be used for mapping reordered difference values, or target values, D to codes.
  • Fig. 17 shows an example for assigning bit lengths shorter than 8 bits to the pixels of the different blocks.
  • The average bit length in the example is somewhat higher than 5, wherein pixels having a bit length of 6, i.e. higher than 5, are Y pixels. This is due to the fact that errors which result from truncation are more visible for Y pixels than for U or V pixels, so the truncation error for Y pixels is kept as small as possible.
  • the present invention relates to an efficient video bit stream compression and decompression that applies a new compression and decompression method, and relates to an apparatus for reducing the data rate of the compressed blocks of image.
  • These compressed blocks may be used as reference for other non-intra type blocks of image in motion compensation.
  • The invention applies the following main new concepts to achieve a low bit rate for storing the reference frame/block during decompression:
  • Multiple encoding and decoding engines are applied, filling the bit stream together, to achieve the highest throughput in each clock cycle.
  • the complexity and quality of the uncompressed block of pixels is analyzed (by way of test encoding) and a decision on the compression mode (in particular coding parameters of the VLC encoding) to be applied is made based on the analysis.
  • The differential values of consecutive adjacent pixels within a block, most likely the 1st row of a block, will be calculated for VLC (variable length) coding.
  • The differential values of consecutive adjacent pixels within a block, most likely the 1st column of a block, will be calculated for VLC (variable length) coding.
  • The differential values with respect to the upper pixel and the left pixel within a block will be calculated. The smaller one will be coded by VLC (variable length) coding.
  • The differential values of adjacent pixels will be adjusted to be all positive, which reduces the bit requirement from 9 bits down to 8 bits.
  • each color component (Y, U and V) is compressed and decompressed separately with the final data rate combined together to fit a predetermined bit rate.
  • each color component (Y, U and V) has one pixel as the reference for the rest of other pixels.
  • In different clock cycles, different encoders and decoders encode and decode different pixels.
  • Some encoders and decoders will encode and decode more than one color component in different cycles.
  • Multiple encoders and decoders are implemented to encode and decode multiple pixels per cycle in a pipelined manner, which speeds up the performance and provides higher throughput.
  • The adjusted differential value Dn of adjacent pixels will be coded by dividing Dn by a predicted divider to obtain a quotient (Q) and a remainder (R).
  • The divider is predicted without inserting bits to represent it, which reduces the bit rate.
  • A predetermined number of dividers are applied and tested to determine which reaches the lowest data rate; that divider is selected for coding the predicted differential values of pixels.
  • A code (e.g. "00000000") is inserted to indicate that its following 8 bits hold the value of the predicted Dn to be coded, which limits the maximum code length to 16 bits.
  • The compression and decompression method may be used in video stream decoding, with the decoding comprising reconstructing the non-B-type frame or macro block by standard video decoding procedures;
  • the invention also relates to:
  • a method for compressing a block of pixels comprising:
  • a method for compressing and decompressing a block of pixels with a predetermined bit rate of the compressed block of pixels, comprising during compression:
  • An apparatus for decoding a video stream comprising:
  • a first engine decompressing the video stream of a non-B-type frame block by block into a first video format;
  • a first compression engine reducing the bit stream in the first video format to a predetermined length of bits and storing it into a storage device for other pictures' reference;
  • a second decompression engine recovering the block pixels accessed from the compressed reference image buffer;
  • another decompression engine reconstructing the block pixels of the compressed video stream of a B-type frame or macro block;
  • a motion compensation engine adding the decompressed block pixels and the recovered block pixels of the reference image and reconstructing the block pixels.

Abstract

Disclosed are a method and an apparatus for compressing a block of pixels. The method comprises: calculating a prediction value for each pixel of a group of pixels of the block, wherein the prediction value for a given pixel is calculated dependent on pixel values of at least one neighboring pixel of the given pixel; calculating the difference between the pixel value and the prediction value for each pixel of the group; and variable length encoding the calculated differences, wherein the same set of coding parameters of the variable length encoding is used for encoding each difference of the block.

Description

METHOD AND APPARATUS FOR BLOCKWISE COMPRESSION AND DECOMPRESSION FOR DIGITAL VIDEO STREAM DECODING
BACKGROUND OF THE INVENTION
The present invention relates to digital video decompression, and, more specifically to an efficient method of coding and decoding a stream of video data bits and an apparatus therefor.
The International Standards Organization (ISO) and the International Telecommunication Union (ITU) have developed and defined the digital video compression standards MPEG-1, MPEG-2, MPEG-4, MPEG-7, H.261, H.263 and H.264. Video compression is useful for a wide range of applications which, for example, include video telephony systems, surveillance systems, DVD systems, and digital TV systems. Video compression techniques significantly reduce the storage space required for storing video data, and the transmission time for electronically transmitting video data, without significantly reducing the image quality.
Most ISO and ITU motion video compression standards adopt Y, Cb (U) and Cr (V) as the pixel elements, which are derived from the original R (Red), G (Green), and B (Blue) color components. Y represents the degree of "Luminance", while Cb and Cr (U and V) represent the color difference derived from the "Luminance". In both still and motion picture compression algorithms, a picture is subdivided into blocks of 8x8 pixels, wherein each pixel is represented by the above mentioned Y, Cb, Cr pixel elements, and wherein these pixel blocks go through similar compression procedures individually.
In the MPEG video compression standard there are three types of encoded pictures or frames: intra-frames (I-frames), forward predicted frames (P-frames), and bi-directional predicted frames (B-frames). An I-frame is encoded as a single image, with no reference to any past or future frames, i.e. a block of 8x8 pixels is encoded by using only pixel information from the block itself. A P-frame is encoded relative to the past reference frame, wherein the reference frame may be an I-frame or a P-frame, i.e. a block of 8x8 pixels is encoded by using pixel information from the block itself and by using pixel information from a block of a previous frame. A B-frame is encoded relative to the past reference frame, the future reference frame, or both frames, wherein each of the reference frames may be an I-frame or a P-frame. In principle, in I-frame encoding, all blocks of 8x8 pixels go through the same compression procedure, which is similar to the JPEG algorithm of the ITU. The JPEG algorithm is a still image compression algorithm that includes a Discrete Cosine Transform (DCT), a quantization and a variable length coding (VLC). For P-frames and B-frames, the difference between a target frame and the one or two reference frames is encoded. For encoding a P-frame or a B-frame, the at least one reference frame has to be stored in a memory and the stored pixel data have to be accessed for encoding purposes.
Due to this method of encoding, decompressing P-frames or B-frames requires at least one reference frame to be stored in a memory and then to be accessed for decoding purposes. Due to input/output (IO) data path limitations of most semiconductor memories, accessing the memory and transferring the pixels of the stored reference frame may turn out to be a bottleneck for most implementations.
SUMMARY OF THE INVENTION
The present invention is related to a method and an apparatus for video data stream decoding, which speed up the procedure of reconstructing the digital video with less power consumption. The present invention reduces the computing time compared to its counterparts in the field of video stream decompression. A method according to an example of the invention for compressing a block of pixels comprises: calculating a prediction value for each pixel of a group of pixels of the block, wherein the prediction value for a given pixel is calculated dependent on pixel values of at least one neighboring pixel of the given pixel; calculating the difference between the pixel value and the prediction value for each pixel of the group; and variable length encoding the calculated differences, wherein the same set of coding parameters of the variable length encoding is used for encoding each difference of the block.
An apparatus for compressing a block of pixels according to an example of the invention comprises: means for calculating a prediction value for each pixel of a group of pixels of the block, wherein the prediction value for a given pixel is calculated dependent on pixel values of at least one neighboring pixel of the given pixel; means for calculating the difference between the pixel value and the prediction value for each pixel of the group; and means for variable length coding the calculated differences, wherein the same set of coding parameters of the variable length coding is used for coding each difference of the block.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 serves to explain the three types of motion video coding according to MPEG.
Fig. 2 depicts a block diagram of a video compression system comprising two reference frame buffers.
Fig. 3 illustrates the mechanism of motion estimation.
Fig. 4 illustrates a block diagram of decoding a video stream.
Fig. 5 depicts a block diagram of block based motion compensation, illustrating how block pixels are recovered from weighted reference pixel values of previous and next frame pixels.
Fig. 6 depicts a block diagram of decoding a video stream using compression and decompression of reference frames.
Fig. 7 illustrates subdividing a frame into blocks of pixels.
Fig. 8 illustrates calculating prediction values for pixels of a block.
Fig. 9, as an example, shows pixel values and corresponding prediction values of blocks of pixels.
Fig. 10 illustrates the distribution of differences between pixel values and prediction values.
Fig. 11 illustrates mapping of differences to positive values.
Fig. 12 illustrates variable length coding.
Fig. 13 shows a table including several differences and their VLC codes for different coding parameters.
Fig. 14 shows an example of a test encoder.
Fig. 15 shows a system including a test encoder and a variable length encoder.
Fig. 16 gives an overview of the compression method.
Fig. 17 depicts a coding scheme for a fixed length coding method.
DETAILED DESCRIPTION OF THE DRAWINGS
There are essentially three types of picture coding in the MPEG video compression standard, as shown in Fig. 1: the I-frame 11, the "Intra-coded" frame, uses the block of pixels within the frame to code itself. The P-frame 12, the "Predictive" frame, uses a previous I-frame or P-frame as a reference to code the differences between these frames. The B-frame 13, the "Bi-directional" interpolated frame, uses the previous I-frame or P-frame 12 as well as the next I-frame or P-frame 14 as references to code the pixel information.
In most applications, since the I-frame does not use any other frame as a reference and hence needs no motion estimation, its image quality is the best of the three types of pictures, and it requires the least computing power in encoding. The encoding procedure of the I-frame is similar to that of a JPEG picture.
Encoding B-frames, as compared to I-frames and P-frames, requires the most computing power, because the motion estimation needs to be done in relation to both a previous frame and a next frame. A lower bit rate of encoded B-frames as compared to P-frames and I-frames is due to the fact that the average block displacement of a B-frame to either the previous or the next frame is less than that of the P-frame, and that the quantization step is larger than that in a P-frame. In most video compression standards, including MPEG, a B-type frame is not allowed to be used as a reference by other frames. Thus, errors in a B-frame will not be propagated to other frames, and allowing bigger errors in B-frames is more common than in P-frames or I-frames. Encoding of the three MPEG picture types becomes a tradeoff among performance, bit rate and image quality. A ranking of these three factors for the three types of picture encoding is given below:
[Table: ranking of performance, bit rate and image quality for I-, P- and B-frame encoding]
Fig. 2 shows a block diagram of the MPEG video compression procedure. In I-type frame coding, a multiplexer (MUX) 221 directly routes an incoming data stream 21, which includes pixel data, to a Discrete Cosine Transform (DCT) unit 23. DCT coefficients provided by the DCT unit are quantized using a quantization unit connected downstream of the DCT unit. Quantized DCT coefficients, which are provided by the quantization unit 25, are packed as pairs of "Run-Length" code, which has patterns that will later be counted and assigned codes with variable length by a Variable Length Coding (VLC) encoder 27. The Variable Length Coding depends on the structure (patterns) in a picture represented by the pixel data. Optionally, a further encoder, a so-called "System Layer Encoder", may be connected downstream of the VLC encoder. Such an encoder may add a system header and further system level information, including resolution, frame rate, etc., to the encoded bit stream.
A compressed (I-frame or P-frame) bit stream will then be reconstructed by a decompression unit (Re-constructor) 29, and the reconstructed data (bit) stream is stored in a reference frame buffer 26 as a reference for following (future) frames.
In the case of compressing a P-frame, a B-frame, or a P-type or B-type macro block, the incoming data stream is sent to a motion estimator 24 to compare pixel data of a current macro block with pixel data of a previous macro block stored in the reference frame buffer. "Macro block" in this connection denotes a sub-block of pixels within a frame. A macro block of a given frame is compared to several macro blocks of a previous frame in the search for the best match macro block.
A predictor 22 calculates the pixel differences between a current macro block and the best match macro block of the previous frame or the next frame provided by the motion estimator. The block difference is then fed into the DCT 23, quantization 25, and VLC 27 coding, which is the same procedure as in I-frame coding.
The multiplexer is controlled by a controlling unit (not shown) to switch its input between the incoming data stream and the difference data stream at the output of predictor 22, depending on which type of encoding is desired.
In the encoding of the differences between frames, the first step is to find the difference of the targeted frame, followed by the coding of the difference. For some considerations including accuracy, performance, and coding efficiency, in some video compression standards, a frame is partitioned into macro blocks of 16x16 pixels to estimate the block difference and the block movement. For each macro block within a frame the "best match" macro block in the previous frame or in the next frame has to be found. The mechanism of identifying the best match macro block is called "Motion Estimation".
Practically, a block of pixels will not move far from its original position in a previous frame; therefore, searching for the best match block within an unlimited region is very time consuming and unnecessary. A limited searching range is commonly defined to limit the computing time of the "best match" block search. The computing-power-hungry motion estimation is used to search for the "best match" candidates within a searching range for each macro block, as described in Fig. 3. According to the MPEG standard, a "macro block" is composed of four 8x8 "blocks" of luminance values (Y) and one, two, or four blocks of chrominance values (2 Cb and 2 Cr), wherein Cb and Cr may also be denoted as U and V. Since luminance and chrominance are closely associated, only luminance motion estimation is needed, and the chrominance components U and V in the corresponding position copy the same motion vector (MV) as the luminance. The motion vector, MV, represents the direction and displacement of the block movement. For example, an MV = (5, -3) stands for a block movement of 5 pixels to the right along the X-axis and 3 pixels down along the Y-axis. The motion estimator searches for the best match macro block within a predetermined searching range 33, 36. By comparing the mean absolute differences (MAD) or sums of absolute differences (SAD), the macro block with the least MAD or SAD is identified as the "best match" macro block. Once the best match blocks are identified, the MV between the targeted block 35 and the best match blocks 34, 37 can be calculated, and the differences between each block within a macro block are encoded accordingly. This kind of block difference coding technique is called "Motion Compensation".
The Best Match Algorithm (BMA) is the most commonly used motion estimation algorithm in popular video compression standards like MPEG and H.26x. In most video compression systems, motion estimation consumes a large share, ranging from about 50% to about 80%, of the total computing power for the video compression. In the search for the best match macro block, a searching range, for example +/- 16 pixels in both the X- and Y-axis, is most commonly defined. The mean absolute difference (MAD) or the sum of absolute differences (SAD), as shown below, is calculated for each position of a macro block within the predetermined searching range, for example +/- 16 pixels along the X-axis and Y-axis:

SAD(x,y) = sum_{i=0..15} sum_{j=0..15} | Vn(x+i, y+j) - Vm(x+dx+i, y+dy+j) |

MAD(x,y) = (1/256) · sum_{i=0..15} sum_{j=0..15} | Vn(x+i, y+j) - Vm(x+dx+i, y+dy+j) |

In the above MAD and SAD equations, Vn and Vm stand for the 16x16 pixel arrays of the current and the reference frame, i and j stand for the 16 pixels of the X-axis and Y-axis respectively, while dx and dy are the change of position of the macro block. The macro block with the least MAD (or SAD) is, following the BMA definition, named the "best match" macro block. The calculation of the motion estimation consumes most computing power in most video compression systems.
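As an illustration of the full-search BMA described above, the following C sketch computes the SAD for every candidate displacement within a +/- 16 pixel range and keeps the motion vector of the least-SAD candidate. The row-major 8-bit frame layout and the function names are assumptions made only for this example; the caller must ensure that the whole search window lies inside the reference frame.

    #include <stdlib.h>
    #include <limits.h>

    /* SAD between the 16x16 macro block of the current frame at (x, y) and the
     * candidate block of the reference frame displaced by (dx, dy). */
    static unsigned sad16x16(const unsigned char *cur, const unsigned char *ref,
                             int stride, int x, int y, int dx, int dy)
    {
        unsigned sad = 0;
        for (int j = 0; j < 16; j++)
            for (int i = 0; i < 16; i++)
                sad += (unsigned)abs((int)cur[(y + j) * stride + (x + i)] -
                                     (int)ref[(y + dy + j) * stride + (x + dx + i)]);
        return sad;
    }

    /* Full search within +/- 16 pixels; returns the motion vector (dx, dy) of the
     * best match macro block, i.e. the candidate with the least SAD. */
    static void best_match(const unsigned char *cur, const unsigned char *ref,
                           int stride, int x, int y, int *mv_x, int *mv_y)
    {
        unsigned best = UINT_MAX;
        *mv_x = 0;
        *mv_y = 0;
        for (int dy = -16; dy <= 16; dy++)
            for (int dx = -16; dx <= 16; dx++) {
                unsigned sad = sad16x16(cur, ref, stride, x, y, dx, dy);
                if (sad < best) { best = sad; *mv_x = dx; *mv_y = dy; }
            }
    }

The MAD of a candidate is simply its SAD divided by 256, so comparing SAD values selects the same best match block.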
Fig. 4 illustrates the procedure of the MPEG video decompression. The compressed video stream, whose system header carries system level information including resolution, frame rate, etc., is decoded by a system layer decoder (SLD) 40, and a decoded data stream representing a compressed pixel data stream is provided to a variable length decoder (VLD) 41 that inverts the VLC operation of the VLC encoder (27 in Fig. 2) and provides decoded DCT coefficients. These decoded DCT coefficients are de-quantized by a dequantization unit 42 before they go through the inverse DCT performed by the inverse discrete cosine transform (iDCT) unit 43. A data stream at the output of the iDCT unit 43 represents a data stream of time domain pixel information.
In decoding non intra-frames, including P-type and B-type frames, the output of the iDCT is the difference between a current frame and a referencing frame. This difference goes through the motion compensation unit 44 for recovering the original pixels. Since motion compensation requires at least one next frame or one previous frame, a decoded I-frame or P-frame may temporarily be stored in a frame buffer 49. The frame buffer 49 may comprise a previous frame buffer 46 for storing a previous frame, as compared to a current frame to be decompressed, and a next frame buffer 47 for storing a next frame, as compared to a current frame to be decompressed. When decompressing a P-type frame or a B-type frame, a memory controller (not shown) will access the frame buffer and transfer some blocks of pixels of the previous frame and/or the next frame to the current frame for motion compensation. However, transferring block pixels to and from the frame buffer consumes a lot of time and I/O bandwidth of the memory or other storage device.
Depending on whether intra-frames (I-frames) or inter-frames (P-frames or B-frames) are to be decompressed, a multiplexer 45 at the output of the decompression unit switches between the output of the iDCT unit and the output of the motion compensation unit to provide a decompressed output data stream. A switching signal for the multiplexer 45 may be provided by the system layer decoder 40.
Fig. 5 shows the motion compensation of block pixels of a B-type frame. The pixel values of a pixel block of the previous frame 52 and the pixel values of a pixel block of the next frame 53 are weighted using first and second weighting factors (0.5, for example). The weighted pixel values are added to the decoded pixel difference of the current frame, thereby obtaining a reconstructed block of pixels.
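A minimal sketch of this weighting step, assuming equal weighting factors of 0.5 and an 8-bit output range; the function name is illustrative only.

    /* Reconstruct one pixel of a B-type block: average the reference pixels of the
     * previous and the next frame (weight 0.5 each), add the decoded pixel
     * difference, and clamp the result to the 8-bit range. */
    static unsigned char mc_bidir_pixel(unsigned char prev, unsigned char next, int diff)
    {
        int v = (prev + next + 1) / 2 + diff;   /* rounded average plus residual */
        if (v < 0)   v = 0;
        if (v > 255) v = 255;
        return (unsigned char)v;
    }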
According to an example of the invention, frames to be stored in the frame buffer 49 are compressed on a block basis prior to storing. This reduces the I/O bandwidth of the memory or storage device forming the frame buffer 49.
Fig. 6, for example, shows an MPEG decompression unit comprising a compression/decompression unit 50 connected between the iDCT unit 43 and the frame buffer, and between the frame buffer 49 and the motion compensation unit 44. The compression/decompression unit 50 compresses a pixel data stream provided by the iDCT unit on a block basis and stores a compressed data stream in the frame buffer. In case one of the frames stored in the frame buffer 49 is required for motion compensation, the compression/decompression unit 50 reads a compressed data stream that represents the required frame from the buffer, decompresses the data stream, and provides the decompressed data stream to the motion compensation unit 44. It is to be understood that the compression/decompression unit 50, which compresses pixel data prior to storing, is not restricted to be used in MPEG decompression methods. It may also be used in MPEG compression methods, or in any other method that requires storing and/or transmitting of pixel data.
An example of the compression/decompression method performed by compression/decompression unit 50 will be explained in the following. According to an example, the method compresses blocks of 4x4 pixels, where each 4x4 block of pixels is represented by a 4x4 block of luminance data values and two 2x2 blocks of chrominance data values. Referring to Fig. 7, reference number 61 denotes a block of 4x4 luminance data values Y11...Y44, and reference numbers 62 and 63 denote blocks of 2x2 chrominance data values U11...U22 and V11...V22. Assuming that each data value is represented by a binary word of 8 bits, a data stream of 24 (= 16 + 4 + 4) bytes has to be compressed for each 4x4 pixel block.
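As a sketch, one such block can be held in a plain structure like the following; the type and field names are not taken from the patent and are used only for illustration.

    #include <stdint.h>

    /* One block in the layout of Fig. 7: a 4x4 luminance block and two 2x2
     * chrominance blocks, 24 bytes in total before compression. */
    typedef struct {
        uint8_t y[4][4];   /* Y11..Y44 */
        uint8_t u[2][2];   /* U11..U22 */
        uint8_t v[2][2];   /* V11..V22 */
    } pixel_block_4x4;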
Instead of directly coding the pixel data, the method codes differences between neighboring pixels, where the value of one pixel is used as prediction for the value of an adjacent pixel. A possible prediction scheme will be described in the following with reference to Fig. 8. It is assumed that X represents a pixel in either of blocks 61 , 62, 63. The prediction value for X is then dependent on the position of X within the block and on neighboring pixel values. In the following U denotes the pixel value of the upper neighbor pixel to X, L denotes the pixel value of the neighbor to the left to X, and S denotes the pixel value of the upper-left neighbor to X, which is the neighbor to the left to the pixel having pixel value U. In the example U is in the same column as X, and adjacent to X, L is in the same line as X and adjacent to X, and S is in the same line as U and in the same column as L. S may also be denoted an adjacent diagonal neighbor to X. The prediction value for X is
L, if X is in the first row of a 4x4 luminance or 2x2 chrominance data block (1a)

U, if X is in the first column of a 4x4 luminance or 2x2 chrominance data block (1b)

none, if X is both in the first row and in the first column of a 4x4 luminance or 2x2 chrominance data block (1c)

F, otherwise, where
F = min(U, L) if S ≥ max(U, L)
F = max(U, L) if S ≤ min(U, L)
F = U + L - S otherwise (1d)
The result of applying such a prediction scheme to blocks 61, 62, 63 is illustrated in the lower part of Fig. 8. Pixels Y11, U11, and V11, for which no prediction is made (Eqn. 1c), are depicted white, pixels for which Eqns. 1a or 1b apply are depicted in a light grey, while pixels for which Eqn. 1d applies are depicted in a darker grey.
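The prediction rules (1a) to (1d) can be written compactly as follows; the row-major block layout and the function name are assumptions made for this sketch.

    /* Prediction value for the pixel at (row, col) of a w x w block (4x4 luminance
     * or 2x2 chrominance) according to Eqns. 1a to 1d. blk holds the pixel values
     * in row-major order. Returns -1 for the upper left pixel, for which no
     * prediction is made. */
    static int predict(const unsigned char *blk, int w, int row, int col)
    {
        if (row == 0 && col == 0)
            return -1;                         /* Eqn. 1c: no prediction         */
        if (row == 0)
            return blk[col - 1];               /* Eqn. 1a: left neighbor L       */
        if (col == 0)
            return blk[(row - 1) * w];         /* Eqn. 1b: upper neighbor U      */
        int U = blk[(row - 1) * w + col];      /* upper neighbor                 */
        int L = blk[row * w + col - 1];        /* left neighbor                  */
        int S = blk[(row - 1) * w + col - 1];  /* upper-left (diagonal) neighbor */
        int maxUL = U > L ? U : L;
        int minUL = U < L ? U : L;
        if (S >= maxUL) return minUL;          /* Eqn. 1d, first case            */
        if (S <= minUL) return maxUL;          /* Eqn. 1d, second case           */
        return U + L - S;                      /* Eqn. 1d, otherwise             */
    }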
Fig. 9, for purpose of illustration, shows examples for pixel values of pixels in blocks 61 , 62, 63 and the resulting prediction values obtained by applying the above mentioned prediction strategy.
In the following, A denotes a pixel value of any pixel in blocks 61, 62, 63, while B denotes the corresponding prediction value. The value to be coded is the difference A-B between the pixel value A and the predicted value B, except for the upper left pixels Y11, U11, V11, which are coded directly. Assuming pixel values of blocks 61, 62, 63 have a bit length of 8 bits, A and B range between 0 and 255. Consequently, the difference A-B ranges between -255 and 255, i.e.
-255 ≤ A-B ≤ 255, where 0 ≤ A ≤ 255 and 0 ≤ B ≤ 255 (2)
In general, 9 bits would be required for coding such a difference while allowing lossless decoding. However, assuming that the prediction value is known, the range for the difference A-B to be coded can be narrowed to

-B ≤ A-B ≤ 255 - B, where 0 ≤ A ≤ 255 (3),

which implies that the difference may be encoded using only 8 bits.
The differences A-B are mapped to values of a given interval including 2^8 = 256 different values, wherein these values are coded, as will be described below. It is obvious that more than one difference has to be mapped to the same value of the mapping interval if there are 2^9 different differences and only 2^8 different values for representing those differences. However, by applying a mapping scheme that uses prediction value B as a mapping parameter, the correct difference can be reconstructed from the mapped value using the prediction value B.
The reconstruction of prediction values B for decoding or decompression purposes will be explained in the following:
The prediction values B, as stated above, are calculated using pixel values of neighboring pixels, except for the upper left pixels Y11, U11, V11. Let it be assumed that the pixel values of these pixels Y11, U11, V11 can be reconstructed correctly after coding (and compression). From these pixel values the prediction values for the neighboring pixels Y12, Y21, Y22, U12, U21, U22, V12, V21, V22 can then be calculated. Let it further be assumed that after coding (and compression) the differences assigned to these pixels can be reconstructed correctly. Using the prediction values and the differences, the pixel values of these neighboring pixels can then be reconstructed correctly. Starting from these pixel values, the prediction values for further neighboring pixels can then be calculated, and so on.
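The reconstruction order described above can be sketched as a simple scan of the block: the upper left pixel is taken directly, and every other pixel is recovered by adding its decoded difference to the prediction computed from pixels reconstructed earlier in the scan, reusing predict() from the sketch further above. The function name and the in-memory layout are again assumptions.

    /* Decoder-side reconstruction of a w x w block. diff holds the decoded
     * differences A-B in row-major order (the entry for the upper left pixel is
     * unused), top_left is the directly coded upper left pixel value. */
    static void reconstruct_block(unsigned char *blk, const int *diff, int w,
                                  unsigned char top_left)
    {
        blk[0] = top_left;
        for (int row = 0; row < w; row++)
            for (int col = 0; col < w; col++) {
                if (row == 0 && col == 0)
                    continue;
                int p = predict(blk, w, row, col);   /* uses already reconstructed pixels */
                blk[row * w + col] = (unsigned char)(p + diff[row * w + col]);
            }
    }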
Mapping a difference A-B to a value within the interval of 256 different values may be performed by simply shifting the difference A-B to a positive value using the prediction value B.
Another approach for mapping the differences will be explained in the following:
Usually the pixel values of adjacent pixels in an image are correlated to each other. The differences between the pixel values of adjacent pixels, in a first approach, are therefore normally distributed, as depicted in Fig. 10. For effectively applying a variable length coding (VLC) to the differences A-B, it is therefore desired to map small differences to small values and larger differences to larger values. A possible mapping scheme for mapping a difference C = A-B to a target value D, wherein 0 ≤ D ≤ 255, using prediction value B as a mapping parameter for B < 128, is:
D = B + C for |C| > B (4a)
D = 2·|C| - 1 for |C| ≤ B and C > 0 (4b)
D = 2·|C| for |C| ≤ B and C < 0 (4b)
D = 0 for C = 0 (4c)
The mapping scheme according to Eqns. 4a to 4c maps differences in the range -B ≤ C ≤ B to target values between 0 and 2·B, while differences C larger than B are shifted using B as a shifting parameter. Fig. 11A shows the distribution of possible differences for B=3 (assuming that the differences are normally distributed), while Fig. 11B shows the distribution of the target values obtained by mapping these differences. In the example C=1 is mapped to D=1, C=-1 is mapped to D=2, ..., C=3 is mapped to D=5 and C=-3 is mapped to D=6.
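A small sketch of this mapping, and of the obvious inverse (the inverse is not spelled out in the text but follows from using the same prediction value B on the decoder side), may read as follows for B < 128:

    def map_difference(C, B):
        if C == 0:
            return 0                    # Eqn. 4c
        if abs(C) > B:
            return B + C                # Eqn. 4a: shift large differences by B
        if C > 0:
            return 2 * abs(C) - 1       # Eqn. 4b: positive differences -> odd targets
        return 2 * abs(C)               # Eqn. 4b: negative differences -> even targets

    def unmap_difference(D, B):
        if D == 0:
            return 0
        if D > 2 * B:                   # targets beyond 2*B were shifted, not interleaved
            return D - B
        return (D + 1) // 2 if D % 2 else -(D // 2)

    # Fig. 11 example with B = 3: C = 1 -> 1, C = -1 -> 2, ..., C = 3 -> 5, C = -3 -> 6
    assert [map_difference(C, 3) for C in (1, -1, 3, -3)] == [1, 2, 5, 6]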
For B ≥ 128 the same mapping scheme applies, except that the mapping according to Eqn. 4a is replaced by
D = -B + C for |C| > B (5a).
The target values obtained by mapping the differences to all positive values are encoded using variable length coding (VLC). For obtaining the code, the target value D, representing the difference C = A-B, is expressed as
D = m·Q + R (5),
where m = 2^x, with m < D, Q denotes the integer result obtained by dividing D by m, and R denotes the remainder of the division operation. For each target value D and for a given m, the integer Q and the remainder R are mapped to a code as shown in Fig. 12. The code includes Q leading bits, each equal to zero, followed by a single marker bit with value 1, followed by the binary equivalent of the remainder R. The bit length of the remainder R is x.
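A sketch of this code construction (assuming m = 2^x, so that the remainder fits into exactly x bits) may read:

    def vlc_encode(D, x):
        Q, R = divmod(D, 1 << x)                     # D = m*Q + R with m = 2**x
        rem_bits = format(R, "0{}b".format(x)) if x > 0 else ""
        return "0" * Q + "1" + rem_bits              # Q zeros, marker bit, x-bit remainder

    def vlc_decode(bits, x):
        Q = bits.index("1")                          # number of leading zeros = quotient
        R = int(bits[Q + 1:Q + 1 + x], 2) if x > 0 else 0
        return Q * (1 << x) + R

    # e.g. D = 19 with m = 2**3 = 8: Q = 2, R = 3 -> "00" + "1" + "011"
    assert vlc_encode(19, 3) == "001011" and vlc_decode("001011", 3) == 19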
Fig. 13 shows a table including several examples of codes obtained for different D values and different m values, where m is 2^3 = 8, 2^4 = 16 or 2^0 = 1 in the example. Assuming that all the differences between pixels in the block to be encoded lie in a narrow range owing to a high correlation between pixel values in the block, and further assuming that m fits the differences to be coded well, such that both Q and R become small, short codes can be obtained.
According to an embodiment of the invention, divisor m is kept constant for each block of pixels, but may vary between blocks. Besides the coded differences of one block, the value of m used for VLC coding the differences has to be stored or transmitted, since it is required for decoding. The best value of m may be selected for each block using so-called test encoding. In this connection a complete block is encoded several times using a different m each time, the overall number of bits obtained for each block is counted, and the m value resulting in the lowest number of bits, or resulting in the number of bits closest to a desired number of bits, is chosen for the block. Turning to the example in Fig. 6 and assuming that a compression factor of 1.5 is desired, the desired number of bits is: 24·8/1.5 = 16·8 = 128.
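The selection of m by test encoding can be sketched as follows (hypothetical function names; only the divisor is varied here, whereas the text also varies truncation lengths):

    def code_length(D, x):
        return D // (1 << x) + 1 + x                 # Q zeros + marker bit + x remainder bits

    def select_x(targets, candidate_xs, desired_bits=None):
        totals = {x: sum(code_length(D, x) for D in targets) for x in candidate_xs}
        if desired_bits is None:
            return min(totals, key=totals.get)       # lowest overall bit count
        return min(totals, key=lambda x: abs(totals[x] - desired_bits))  # closest to budget

    # e.g. choosing among x = 0, 3, 4 with the 128-bit budget mentioned above
    best_x = select_x([0, 1, 5, 2, 7, 3], candidate_xs=(0, 3, 4), desired_bits=128)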
The code length may further be decreased by truncating one or more least significant bits (LSBs) of the target value prior to mapping the target value D to the code, thereby decreasing the range of differences within a block (lossy encoding). For example, up to four different modes of VLC coding with four different truncation lengths may be used.
Fig. 14 shows an example of a test encoder arrangement 70 for evaluating the m value and the truncation length to be applied for coding a given Y block and given U and V blocks. In Fig. 14, Y_trunc denotes the truncation length applied to differences of the Y block (61 in Fig. 7), UV_trunc denotes the truncation length applied to differences of the U and V blocks (62, 63 in Fig. 7), Y_x with 2^Y_x = m denotes the divisor m used for VLC coding of a Y block, and UV_x with 2^UV_x = m denotes the divisor m used for VLC coding of U and V blocks. The test encoder arrangement 70 of Fig. 14 comprises four test encoders 71, 72, 73, 74, with each test encoder consisting of two separate encoders for the luminance data 71Y, ..., 74Y and the chrominance data 71UV, ..., 74UV.
Each of the luminance data encoders 71Y, ..., 74Y VLC encodes a complete Y block using the Y_trunc and Y_x parameters and counts the overall number of bits, and each of the chrominance data encoders 71UV, ..., 74UV VLC encodes a complete U and a complete V block using the UV_trunc and UV_x parameters and counts the overall number of bits. It should be mentioned that Y_x and UV_x of one encoder unit may be identical or different, and that Y_trunc and UV_trunc of one encoder unit may be identical or different.
The numbers of bits provided by the luminance and chrominance data encoder pair of one encoder unit are added by adders 75, 76, 77, 78, and the resulting overall bit count values BitCt0, BitCt1, BitCt2, BitCt3 are provided to a mode selector 79, which, based on the overall bit count values and on the desired number of coded bits, provides a mode select signal MS. It should be mentioned that the number of parallel test encoders, and therefore the number of different parameter sets tested in parallel, is not limited to four; any number of parallel encoders may be used.
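The arrangement of Fig. 14 can be modelled behaviourally as follows (a sketch with hypothetical names, not a description of the actual hardware; each parameter set stands for one of the four test encoder units):

    from collections import namedtuple

    ParamSet = namedtuple("ParamSet", "y_trunc uv_trunc y_x uv_x")

    def count_bits(targets, trunc, x):
        # one test encoder: truncate 'trunc' LSBs, then sum the VLC code lengths
        return sum((D >> trunc) // (1 << x) + 1 + x for D in targets)

    def mode_select(y_targets, u_targets, v_targets, param_sets, desired_bits):
        totals = []
        for p in param_sets:                                   # encoder units 71 .. 74
            y_bits = count_bits(y_targets, p.y_trunc, p.y_x)
            uv_bits = count_bits(u_targets + v_targets, p.uv_trunc, p.uv_x)
            totals.append(y_bits + uv_bits)                    # adders 75 .. 78 -> BitCt0 .. BitCt3
        return min(range(len(totals)),                         # mode selector 79 -> MS
                   key=lambda i: abs(totals[i] - desired_bits))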
Referring to Fig. 15, mode select signal MS is provided to an encoder 80 that VLC codes the given Y block and the given U and V blocks taking into account mode select signal MS. The mode select signal MS may include Y_trunc, UV_trunc, Y_x and UV_x. Encoder 80 outputs a compressed data stream that represents the VLC encoded differences A-B obtained for each pixel of blocks 61, 62, 63 by subtracting prediction value B from pixel value A. The data stream provided by encoder 80 also includes codes representing pixel data of the upper left pixels Y11, U11, V11 of the blocks. These pixel data are coded directly using a fixed length code, where one or more LSBs may be truncated prior to coding.
It goes without saying that besides codes representing the values of the upper left pixels Y11, U11, V11 and codes representing differences, the parameters Y_trunc, UV_trunc, Y_x and UV_x need to be coded and stored or transmitted, since these parameters are required for decompression.
Fig. 16 gives an overview of the block compression method according to an example of the invention. The method comprises predicting pixel values for block pixels using neighboring pixels, forming the difference between pixel values and predicted values, reordering the differences, and VLC coding the reordered differences. Referring to the above, reordering the differences may include mapping the differences to all positive target values.
Decompressing a compressed data stream includes VLC decoding, reordering back the differences obtained by decoding, prediction starting with Y11, U11, V11 as explained above, and calculating decompressed pixel values from the differences and the predicted values, wherein the prediction for given pixels may require decompressed pixel values of neighboring pixels. In case of truncating LSBs prior to VLC encoding, pixel information is lost. According to one embodiment of the invention, the truncated LSBs are added on a pseudo-random basis during the decompression process.
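How the pseudo-random source is built is not specified here; a minimal sketch of the LSB re-insertion step, assuming a seeded software generator, could look as follows:

    import random

    def restore_block(decoded_values, trunc, seed=0):
        # re-expand truncated values; the missing LSBs are filled pseudo-randomly
        rng = random.Random(seed)
        return [(v << trunc) | (rng.getrandbits(trunc) if trunc > 0 else 0)
                for v in decoded_values]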
Instead of VLC coding, a fixed length coding using LSB truncation may be used for mapping reordered difference values, or target values, D to codes. Fig. 17 shows an example of assigning bit lengths shorter than 8 bits to the pixels of the different blocks. The average bit length in the example is somewhat higher than 5, wherein the pixels having a bit length of 6, i.e. higher than 5, are Y pixels. This is due to the fact that errors resulting from truncation are more visible for Y pixels than for U or V pixels, so the truncation error for Y pixels is kept as small as possible.
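A sketch of such a fixed-length coding, with an illustrative per-value bit-length table (not the one from Fig. 17), could read:

    def fixed_length_encode(values, bit_lengths):
        # keep only the n most significant of 8 bits for each value
        return ["{:0{}b}".format(v >> (8 - n), n) for v, n in zip(values, bit_lengths)]

    def fixed_length_decode(codes, bit_lengths):
        return [int(c, 2) << (8 - n) for c, n in zip(codes, bit_lengths)]

    codes = fixed_length_encode([200, 37, 129], [6, 5, 5])   # -> ['110010', '00100', '10000']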
Summarizing the above, the present invention relates to efficient video bit stream compression and decompression applying a new compression and decompression method, and to an apparatus for reducing the data rate of the compressed blocks of an image. These compressed blocks may be used as reference for other non-intra-type image blocks in motion compensation.
The invention applies the following main new concepts to achieve a low bit rate for storing the decompression reference frame/block:
- Calculation of the differential value of adjacent pixels by applying horizontal prediction, vertical prediction and both-direction prediction,
- Prediction of the divider value for achieving the shortest code for VLC coding,
- Multiple encoding and decoding engines filling the bit stream at a time to achieve the highest throughput in each clock cycle. According to the method for compression, the complexity and quality of the uncompressed block of pixels are analyzed (by way of test encoding), and a decision on the compression mode (in particular the coding parameters of the VLC encoding) to be applied is made based on the analysis.
According to one embodiment of the present invention, the differential values of continuous adjacent pixels within a block, most likely the 1st row of a block, will be calculated for variable length (VLC) coding. According to another embodiment of the present invention, the differential values of continuous adjacent pixels within a block, most likely the 1st column of a block, will be calculated for variable length (VLC) coding. According to yet another embodiment of the present invention, the differential values with respect to the upper pixel and the left pixel within a block will be calculated; the smaller one will be coded by variable length (VLC) coding.
According to a further embodiment of the present invention, the differential values of adjacent pixels will be adjusted to be all positive, which reduces the bit requirement from 9 bits down to 8 bits.
According to another embodiment of the present invention, each color component (Y, U and V) is compressed and decompressed separately with the final data rate combined together to fit a predetermined bit rate.
According to another embodiment of the present invention, each color component (Y, U and V) has one pixel as the reference for the rest of other pixels.
According to another embodiment of the present invention, in different clock cycle times different encoders and decoders will encode and decode different pixels. According to another embodiment of the present invention, some encoders and decoders will encode and decode more than one color component in different cycle times.
According to another embodiment of the present invention, multiple encoders and decoders are implemented to encode and decode multiple pixels in a pipelined manner in each cycle time, which speeds up the performance and provides higher throughput.
According to another embodiment of the present invention, the adjusted differential value Dn of adjacent pixels will be coded by dividing Dn by a predicted divider to obtain a quotient (Q) and a remainder (R).
According to another embodiment of the present invention, the quotient and the remainder are separated by inserting one "1", and the values of Q and R are each represented by a number of "0"s (example: 17 = 3 x 5 + 2, Q=3, R=2, the code will be "000100").
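Under this reading (both the quotient and the remainder represented as runs of zeros, separated by a single "1"), a short sketch reproducing the example is:

    def qr_zero_run_code(value, divider):
        Q, R = divmod(value, divider)
        return "0" * Q + "1" + "0" * R

    assert qr_zero_run_code(17, 5) == "000100"   # 17 = 3 x 5 + 2 -> Q = 3, R = 2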
According to another embodiment of the present invention, the divider is predicted without inserting bits to represent it, which reduces the bit rate.
According to one embodiment of the present invention, a predetermined number of dividers are applied and tested to determine which one reaches the lowest data rate; that divider is selected for coding the predicted differential values of the pixels.
According to one embodiment of the present invention, a code (e.g. "00000000") is inserted to indicate that its following 8 bits directly represent the value of the predicted Dn to be coded, which limits the maximum code length to 16 bits.
The compression and decompression method may be used in video stream decoding, with the decoding comprising: reconstructing the non-B-type frame or macroblock by standard video decoding procedures;
partitioning the reconstructed block of pixels into smaller blocks of fewer pixels, compressing them and saving them into a temporary storage device as a referencing image; and
accessing the compressed block of corresponding pixels in the temporary storage device and reconstructing them into pixels as the referencing image for motion compensation to recover other pixels.
The invention also relates to:
A method for compressing a block of pixels, comprising:
calculating the difference between neighboring pixels;
determining the value of the divider which achieves the shortest code for representing the differential value of the neighboring pixels;
calculating and predicting the mode of variable length coding for each pixel within a certain block; and
applying the predicted VLC coding mode to reduce the number of bits representing the block pixels' difference patterns.
A method for compressing and decompressing a block of pixels with a predetermined bit rate of the compressed block of pixels, the method comprising during compression:
calculating the difference between neighboring pixels; compressing at least two pixels at the same determined time slot; and
saving at least two compressed pixels into a storage device of predetermined length starting from at least two locations with predetermined directional path locations each cycle time;
and the method comprising during decompression:
decoding the value of the divider of the block pixels and parsing and decompressing at least two remainders and two quotients at a time by using the predetermined variable length decoding method;
calculating the difference between neighboring pixels;
compressing at least two pixels at the same determined time slot; and
saving at least two compressed pixels into a storage device of predetermined length starting from at least two locations with predetermined directional path locations each cycle time.
An apparatus for decoding a video stream, comprising:
a first engine decompressing the video stream of non-B-type frame block by block into a first video format;
a first compression engine reducing the bit stream in the first video format to a predetermined number of bits and storing it into a storage device for reference by other pictures;
a second decompression engine recovering the block pixels accessed from the compressed referencing image buffer;
another decompression engine reconstructing the block pixels of the compressed video stream of a B-type frame or macroblock; and
a motion compensation engine adding the decompressed block pixels and the recovered block pixels of the reference image and reconstructing the block pixels.

Claims

1. A method for compressing a block of pixels, comprising: calculating a prediction value for each pixel of a group of pixels of the block, wherein the prediction value for a given pixel is calculated dependent on pixel values of at least one neighboring pixel of the given pixel, calculating the difference between the pixel value and the prediction value for each pixel of the group, variable length encoding the calculated differences, wherein a same set of coding parameters of the variable length encoding is used for encoding each difference of the block.
2. The method of claim 1, wherein determining the coding parameters of the variable length coding comprises: test encoding the differences using a first variable length encoding with a first given testing set of coding parameters for obtaining a first group of codes, counting the overall number of bits of the codes of the first group for obtaining a first number of bits, test encoding the differences using at least one further variable length encoding with a further given testing set of coding parameters for obtaining at least one further group of codes, counting the overall number of bits of the codes of the at least one further group for obtaining at least one further number of bits, comparing the first number and the at least one further number, and selecting the first testing set or the at least one further testing set as the set of coding parameters for variable length encoding.
3. The method of claim 2, wherein the first and the at least one further test encoding are performed in parallel.
4. The method of any of the preceding claims, wherein prediction values are calculated for each but one pixel of the block.
5. The method of claim 4, wherein the prediction value for a given pixel of the block is
L, if the given pixel is in a first row of the block,
U, if the given pixel is in a first column of the block,
F, otherwise, where
F = min(U,L) if S ≥ max(U,L),
F = max(U,L) if S < min(U,L),
F = U + L - S otherwise, and where
U is the pixel value of an upper neighbor pixel of the given pixel,
L is the pixel value of a left neighbor pixel of the given pixel, and
S is the pixel value of a left and upper neighbor pixel of the given pixel.
6. The method of claim 5, wherein no prediction value is calculated for a given pixel that is in the first row and in the first column of the block.
7. The method of any of the preceding claims, wherein variable length encoding of a given difference comprises: dividing the difference by a value (m) to obtain an integer (Q) and a remainder (R), and mapping the integer and the remainder to a code.
8. The method of claim 7, wherein (m) is one of the coding parameters.
9. The method of any of the preceding claims, wherein the differences prior to variable length encoding are mapped to positive values.
10. The method of claim 9, wherein differences having the same absolute value but different signs are mapped to adjacent positive values.
11. The method of any of the preceding claims, wherein the differences are represented by binary words, and wherein a number of less significant bits of these binary words are truncated prior to variable length encoding.
12. The method of claim 11, wherein the number of truncated bits is one of the coding parameters.
13. The method of any of the preceding claims, wherein the codes obtained by variable length encoding the differences of one block are stored in a storage device, wherein the pixel value of at least one pixel for which no difference has been calculated is stored in the storage device, and wherein the coding parameters used for said block are stored in the storage device.
14. A method for decompressing a block of pixels which has been compressed using a method according to one of claims 1 to 13, comprising: variable length decoding for obtaining differences, calculating prediction values starting with pixels of the block which are neighbors to the at least one pixel for which no difference has been calculated during compression, calculating pixel values using the prediction values and the differences.
15. The method of claim 14 for decompressing a block of pixels which has been compressed using a method according to claim 11, wherein less significant bits that have been truncated are pseudo-randomly added after variable length decoding.
16. Use of the methods according to any of the preceding claims for compressing and decompressing frames in video stream decoding.
17. An apparatus for compressing a block of pixels, comprising: means for calculating a prediction value for each pixel of a group of pixels of the block, wherein the prediction value for a given pixel is calculated dependent on pixel values of at least one neighboring pixel of the given pixel, means for calculating the difference between the pixel value and the prediction value for each pixel of the group, means for variable length coding the calculated differences, wherein a same set of coding parameters of the variable length coding is used for coding each difference of the block.
PCT/EP2007/002164 2006-03-10 2007-03-12 Method and apparatus for blockwise compression and decompression for digital video stream decoding WO2007104524A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP07723201A EP1994762A1 (en) 2006-03-10 2007-03-12 Method and apparatus for blockwise compression and decompression for digital video stream decoding

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
EP06004987.1 2006-03-10
EP06004987 2006-03-10

Publications (1)

Publication Number Publication Date
WO2007104524A1 true WO2007104524A1 (en) 2007-09-20

Family

ID=38117055

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2007/002164 WO2007104524A1 (en) 2006-03-10 2007-03-12 Method and apparatus for blockwise compression and decompression for digital video stream decoding

Country Status (2)

Country Link
EP (1) EP1994762A1 (en)
WO (1) WO2007104524A1 (en)

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5528628A (en) * 1994-11-26 1996-06-18 Samsung Electronics Co., Ltd. Apparatus for variable-length coding and variable-length-decoding using a plurality of Huffman coding tables

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
HUANG Y-W ET AL: "ANALYSIS, FAST ALGORITHM, AND VLSI ARCHITECTURE DESIGN FOR H.264/AVC INTRA FRAME CODER", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, IEEE SERVICE CENTER, PISCATAWAY, NJ, US, vol. 15, no. 3, March 2005 (2005-03-01), pages 378 - 401, XP001225785, ISSN: 1051-8215 *

Also Published As

Publication number Publication date
EP1994762A1 (en) 2008-11-26

Similar Documents

Publication Publication Date Title
US10142654B2 (en) Method for encoding/decoding video by oblong intra prediction
US7466867B2 (en) Method and apparatus for image compression and decompression
US6438168B2 (en) Bandwidth scaling of a compressed video stream
EP1853069B1 (en) Decoding macroblock type and coded block pattern information
US8503521B2 (en) Method of digital video reference frame compression
KR100578049B1 (en) Methods and apparatus for predicting intra-macroblock DC and AC coefficients for interlaced digital video
EP1528813B1 (en) Improved video coding using adaptive coding of block parameters for coded/uncoded blocks
US7324595B2 (en) Method and/or apparatus for reducing the complexity of non-reference frame encoding using selective reconstruction
KR100984612B1 (en) Global motion compensation for video pictures
US20070217702A1 (en) Method and apparatus for decoding digital video stream
EP1383339A1 (en) Memory management method for video sequence motion estimation and compensation
US20050089098A1 (en) Data processing apparatus and method and encoding device of same
US20080165859A1 (en) Method of digital video frame buffer compression
US8326060B2 (en) Video decoding method and video decoder based on motion-vector data and transform coefficients data
US20020159526A1 (en) Video encoder and video recording apparatus provided with such a video encoder
US20060227874A1 (en) System, method, and apparatus for DC coefficient transformation
WO2007104524A1 (en) Method and apparatus for blockwise compression and decompression for digital video stream decoding
EP1298937A1 (en) Video encoding or decoding using recompression of reference frames
US20130170565A1 (en) Motion Estimation Complexity Reduction
US20060222065A1 (en) System and method for improving video data compression by varying quantization bits based on region within picture
CN116405694A (en) Encoding and decoding method, device and equipment
Shoham et al. Introduction to video compression
KR20020095260A (en) Video encoder and recording apparatus
Farag et al. New technique for motion estimation to be used in MPEG systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 07723201

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2007723201

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE