Reducing the amount of data needed to reproduce video saves storage space, increases access speed and is, at present, the only practical way to achieve full-motion video on digital computers. This document looks at digital video and explains some techniques for reducing the storage space needed.
After looking at the Moving Pictures Experts Group (MPEG) standard, some of its many competitors in the same market place are examined objectively. Some applications of digital video are then presented.
Digital video is not without its problems, many of which are shared by all digital media. These problems are discussed at some length.
A great deal of research has gone into image and video compression, and it is quite difficult to invent something new in this field. A diagram showing the many compression techniques is shown in figure 2.1. The assumption is that the input is always a PCM digitised signal in colour components. The output of the compression process is a bitstream. Let's consider each technique briefly:
Simple Compression Techniques Various techniques exist, including:
- Truncation Reduces data by reducing the number of bits per pixel. This suffers from contouring (visible banding caused by the loss of amplitude resolution) but has the advantage that processing is simple.
- Colour Lookup Table (CLUT) Pixel values represent an index into a table of colours, so each pixel needs only enough bits to index the table. The processing for this is non-trivial.
- Run length coding Blocks of repeated pixels are replaced with a pixel value and a count. This works well on images with blocks of single colours and can achieve a high compression ratio. However, it is not effective if images contain no repetitive areas.
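The run-length idea can be sketched in a few lines of Python (the function names are illustrative, not part of any standard):

```python
def rle_encode(pixels):
    """Replace runs of repeated pixel values with (value, count) pairs."""
    encoded = []
    i = 0
    while i < len(pixels):
        run_value = pixels[i]
        run_length = 1
        while i + run_length < len(pixels) and pixels[i + run_length] == run_value:
            run_length += 1
        encoded.append((run_value, run_length))
        i += run_length
    return encoded

def rle_decode(pairs):
    """Expand (value, count) pairs back into the original pixel run."""
    pixels = []
    for value, count in pairs:
        pixels.extend([value] * count)
    return pixels

row = [7, 7, 7, 7, 0, 0, 255, 255, 255]
packed = rle_encode(row)          # [(7, 4), (0, 2), (255, 3)]
assert rle_decode(packed) == row  # lossless round trip
```

On a region of a single colour the ratio can be dramatic; on a noisy image every run has length one and the "compressed" output is actually larger, which is the weakness noted above.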
This technique aims to send a subset of the pixels and use interpolation to reconstruct the intervening pixels. It is particularly useful for motion sequences: certain frames are compressed by still compression; the frames between these are compressed by interpolating between the other frames and sending only the data needed to correct the interpolation.
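As a minimal sketch of that idea, assuming frames are simple lists of pixel values (the function names are illustrative):

```python
def encode_between(prev_frame, next_frame, actual_frame):
    """Predict the in-between frame by averaging the two key frames,
    then store only the corrections to that prediction."""
    predicted = [(a + b) // 2 for a, b in zip(prev_frame, next_frame)]
    return [actual - p for actual, p in zip(actual_frame, predicted)]

def decode_between(prev_frame, next_frame, corrections):
    """Rebuild the in-between frame from the key frames plus corrections."""
    predicted = [(a + b) // 2 for a, b in zip(prev_frame, next_frame)]
    return [p + c for p, c in zip(predicted, corrections)]

prev_f = [10, 20, 30]
next_f = [30, 20, 10]
actual = [22, 20, 18]
corr = encode_between(prev_f, next_f, actual)   # [2, 0, -2]
assert decode_between(prev_f, next_f, corr) == actual
```

When the interpolation is a good guess, the corrections are small numbers clustered around zero, which themselves compress very well.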
Predictive Techniques This relies on the fact that there is nearly always some redundancy between successive frames in a sequence. There are two common methods:
- DPCM (Differential Pulse Code Modulation) This operates at the pixel level and sends only the difference between successive pixels. Since there is likely to be very little difference between adjacent pixels, the difference can be encoded in fewer bits. This technique suffers from slope overload, which causes smearing at high-contrast edges in an image.
- ADPCM (Adaptive DPCM) This tries to reduce slope overload by adapting the step size used for the difference values.
Transform Coding Techniques
A transform is a process that converts data into an alternate form which is more convenient for some particular purpose. Transforms are ordinarily designed to be reversible. Useful transforms typically operate on large blocks of data and perform some complex calculations. In general transform coding becomes more useful with larger blocks. The Discrete Cosine Transform (DCT) is especially important for video compression.
The DCT The DCT is performed on a block of horizontally and vertically adjacent pixels (typically an 8 by 8 block of pixels). The outputs represent amplitudes of two-dimensional spatial frequency components. These are called DCT coefficients. The coefficient for zero spatial frequency is called the DC coefficient and it is the average value of all the pixels in the block. The rest of the coefficients represent progressively higher horizontal and vertical spatial frequencies in the block.

Statistical Coding (or Entropy Coding) This takes advantage of the statistical distribution of the pixel values. Some data values can occur more frequently than others, and we can therefore set up a coding technique that uses fewer bits for these values. One widely used form of this coding is Huffman encoding. This technique has the overhead that a code table (syntax) has to be pre-defined or sent for the decoder to work.
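A minimal Huffman coder illustrates the idea of giving frequent values shorter codes (a sketch using Python's standard library, not the code tables defined by any particular video standard):

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code: frequent symbols get shorter bit strings."""
    freq = Counter(symbols)
    # Each heap entry: (frequency, tie-breaker, {symbol: bitstring}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, codes1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

pixels = [0] * 90 + [128] * 8 + [255] * 2   # value 0 dominates
codes = huffman_code(pixels)
assert len(codes[0]) < len(codes[255])       # common value -> shorter code
```

Note that the decoder needs the same table to reverse the mapping, which is exactly the overhead mentioned above.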
Since adjacent pixel values tend to be similar or vary slowly from one to another, the DCT processing provides an opportunity for compression by forcing most of the energy into the lower spatial frequency components. In most cases, many of the higher-frequency coefficients will have zero or near-zero values and therefore can be ignored.
The decoder performs the reverse process, but because the DCT involves transcendental functions the inverse can only be computed to finite precision, and hence some loss takes place. The trick is to use cunning methods of deciding which coefficients to keep so that the loss is minimally visible.
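A direct (unoptimised) 8 by 8 DCT makes the DC-coefficient property above concrete; this is a sketch in pure Python, whereas real codecs use fast factorisations of the same transform:

```python
import math

N = 8  # block size

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block of pixel values."""
    def a(k):  # normalisation factor
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            coeffs[u][v] = a(u) * a(v) * s
    return coeffs

flat = [[100] * N for _ in range(N)]   # a block of constant colour
c = dct_2d(flat)
# The DC coefficient carries the block average (scaled by N under this
# normalisation); every other coefficient is numerically zero.
assert abs(c[0][0] - N * 100) < 1e-6
assert all(abs(c[u][v]) < 1e-6
           for u in range(N) for v in range(N) if (u, v) != (0, 0))
```

Rounding and quantising these real-valued coefficients before transmission is where the controlled loss enters; a smooth block needs only a handful of them to be kept.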
Compression is needed simply to reduce the amount of space that video would otherwise take to store. There are many factors to consider when choosing a compression technique:
- Real-Time / Non-Real-Time This refers to capturing, compressing, decompressing and playing back all in real time with no delays. The requirement is to have a sufficient frame rate (frames per second) to make sure that there is no jerky motion.
- Symmetrical / Asymmetrical Symmetrical implies capturing, storing and playing back at the same rate. Asymmetrical uses more time to compress and hence may gain an advantage in playback speed.
- Compression Ratio The compression ratio is the size of the original video data compared with the size of the compressed video. Generally, the higher the compression ratio, the poorer the video quality.
- Lossless / Lossy The loss factor determines whether there is a loss of quality between the original image and the image after it has been compressed and played back (decompressed). Again this is affected by the amount of compression.
- Inter-frame / Intra-frame Intra-frame compression treats each frame of the sequence as a discrete picture. Inter-frame compression is a more powerful method which also uses a predictive technique between frames.
- There is no copy from copy loss
- Picture does not get fuzzy
- Signal-to-Noise ratio goes down slowly
- Editing, storage and retrieval is simpler, quicker and cheaper
- Frame Rate How many frames are displayed per second, and also the method of frame display: progressive, where each line of video is shown one after the other, or interlaced, where the odd lines (one field) are shown and then the even lines (the other field).
- Colour Resolution This refers to the number of colours that can be displayed at any one time; colour depth is the maximum number of colours available. There are also various colour formats: RGB and YUV are two common ones.
- Spatial Resolution This deals with the size (width and height in pixels) of the picture.
- Image quality Does the final sequence match the requirements of the application?
Digital video has many and varied applications; here we briefly look at some of them. The number of applications is growing rapidly as the need for compression and digital transmission grows.
- High Definition Television (HDTV)
HDTV is defined as having twice the horizontal and twice the vertical resolution of conventional television, a 16:9 picture aspect ratio and at least 24 frames per second. Using this definition, HDTV has approximately double the number of lines of current broadcast television. Combined with the other increases in resolution, this means that roughly 6 times more bandwidth is needed for transmission. This is an ideal place for compression, as this will reduce the data rate and hence the bandwidth.
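A rough back-of-the-envelope calculation shows the scale of the problem; the frame sizes below (720 by 576 and 1920 by 1080) are purely illustrative, and the exact factor depends on the formats chosen:

```python
# Raw pixel counts of an illustrative standard-definition frame
# versus an illustrative HDTV frame.
sd_pixels_per_frame = 720 * 576
hd_pixels_per_frame = 1920 * 1080

ratio = hd_pixels_per_frame / sd_pixels_per_frame
assert ratio == 5.0  # raw data grows roughly five- to six-fold

# Uncompressed, even the smaller frame is punishing:
# 3 bytes per pixel at 25 frames per second.
sd_bytes_per_second = sd_pixels_per_frame * 3 * 25
assert sd_bytes_per_second == 31_104_000   # about 31 MB every second
```

Uncompressed HDTV multiplies that again, which is why compression is not optional for transmission or storage.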
- Multimedia

This is the number one application for digital video. It includes video kiosks, training, corporate presentations and video libraries. The advantages of using digital video (and particularly MPEG) are:
- Footage can be updated or changed with ease
- MPEG has network capabilities, which means a presentation can be distributed over a network
- Digital video adds a whole new dimension to presentations. Moving pictures can be incorporated into computer presentations with ease.

Multimedia used in student training has also been shown to improve achievement by an average of 38 percent.
Since digital video clips are stored in files, they can easily be integrated into many databases, just like text or numeric fields. For example, a travel agency can keep video clips of its holiday locations as well as the more mundane information, and really show what it is like to holiday at a particular resort.