Motion Estimation and Shape Representation for Object-based Video Coding


*Funding Body:* EPSRC (GR/L54868)

*Investigators:* Dr. G.R. Martin (Principal Investigator), Dr. R.A. Packwood

*Objectives Summary & Results:*

Evolving object-based coding standards, such as MPEG-4, introduce radically different functionalities at the expense of increased computational complexity. There is also a demand for greater compression to permit communication over widely available low-bandwidth channels, for example in mobile multimedia communications. Current developments that permit arbitrarily shaped objects to be encoded and decoded as separate video object planes (VOPs) have largely adopted existing coding techniques and then modified and optimised them for an object-based framework. Fixed size block matching (FSBM) has remained the preferred approach to motion estimation because of its backward compatibility with previous standards.

The main aim of this project was to develop efficient motion estimation and shape coding techniques specifically designed for object-based video coding. Investigations showed that a variable size block matching (VSBM) motion compensation approach can offer significant coding efficiency improvements over FSBM, while retaining the implementation simplicity and computational complexity of FSBM. The technique was extended to produce a modified VSBM (MVSBM) motion compensation strategy that exploits irregularly shaped areas of uniform motion within small objects. Both VSBM and MVSBM use a quad-tree to represent the irregular motion segmentation structure, which is predictively coded and transmitted with the motion compensation information. Since the arbitrary areas are composed of 4 x 4 blocks, the whole structure can be encoded using a quad-tree in which multiple blocks undergoing the same motion form a single area.
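
To make the quad-tree idea concrete, the following is a minimal sketch of top-down variable size block matching. It is not the project's algorithm: the exhaustive search, the sum-of-absolute-differences cost, the split threshold, and the names `sad`, `best_vector` and `vsbm_split` are all assumptions chosen for illustration. Only the 4 x 4 minimum block size is taken from the summary. Frames are assumed to be rectangular lists of rows of integer samples, and object shape masks and the MVSBM merging of irregular areas are not modelled.

```python
def sad(cur, ref, x, y, size, dx, dy):
    """Sum of absolute differences between a block of `cur` and the
    block of `ref` displaced by (dx, dy)."""
    total = 0
    for j in range(y, y + size):
        for i in range(x, x + size):
            total += abs(cur[j][i] - ref[j + dy][i + dx])
    return total


def best_vector(cur, ref, x, y, size, search):
    """Exhaustive search over displacements in [-search, search] that keep
    the reference block inside the frame. Ties on cost are broken in favour
    of the shorter vector. Returns ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best_key, best_mv = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= x + dx and x + dx + size <= width and
                      0 <= y + dy and y + dy + size <= height)
            if not inside:
                continue
            key = (sad(cur, ref, x, y, size, dx, dy), abs(dx) + abs(dy))
            if best_key is None or key < best_key:
                best_key, best_mv = key, (dx, dy)
    return best_mv, best_key[0]


def vsbm_split(cur, ref, x, y, size, search=2, min_size=4, threshold=0):
    """Return quad-tree leaves as (x, y, size, (dx, dy)) tuples.

    A block is kept whole when its best match costs at most `threshold`
    per pixel (an assumed criterion) or when it has reached `min_size`;
    otherwise it is split into four quadrants and each is matched again."""
    mv, cost = best_vector(cur, ref, x, y, size, search)
    if cost <= threshold * size * size or size <= min_size:
        return [(x, y, size, mv)]
    half = size // 2
    leaves = []
    for oy in (0, half):
        for ox in (0, half):
            leaves += vsbm_split(cur, ref, x + ox, y + oy, half,
                                 search, min_size, threshold)
    return leaves
```

In this sketch a region of uniform motion stays as one large leaf with a single vector, while a region containing several motions is subdivided until each leaf matches well or reaches 4 x 4, which is the property that lets the segmentation be described by quad-tree split decisions plus one vector per leaf.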
While a representation with one motion vector per block would normally be very expensive to transmit, a motion vector redundancy coding (MVRC) scheme proved well suited to producing compact descriptions of motion vector structures that exhibit high spatial redundancy. Additionally, a temporal quad-tree coding (TQC) scheme was developed that exploits temporal redundancy between successive quad-tree structures by means of a differential coding mechanism. In comparison with conventional motion information coding schemes, the combined algorithms reduce the number of coded bits by up to 21% for the same PSNR.

A shape coding strategy that adapts the MPEG-4 arbitrary shape coding techniques to a variable block size framework was also developed. It integrates the shape and quad-tree coding requirements of VSBM and MVSBM in a unified structure while minimising temporal redundancy. The combined cost of MVSBM motion compensation and shape coding was compared with the MPEG-4 motion vector and shape coding requirements: for the same prediction quality, the new scheme reduces the number of coded bits by up to 15%. The shape coding strategy was further improved to make it appropriate for small video objects undergoing fast shape changes caused by either rapid object movement or changes in camera focal length. Compared with conventional motion and shape coding, the new technique provides notable savings in coded bits for the same PSNR.

The combined motion compensation and shape coding algorithms have been implemented in a hybrid video object codec that employs shape-adaptive DCT texture coding. Evaluations on a wide selection of test sequences confirm the quoted coding efficiency gains for the same PSNR and the same perceptual quality of the reconstructed images. Additionally, the techniques developed are shown to be appropriate for the coding of multiple video objects, and are readily scalable.
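
The gains above are quoted at equal PSNR. For reference, the sketch below computes PSNR using the conventional definition, 10 log10(peak^2 / MSE). It assumes 8-bit samples (peak value 255) and frames stored as equally sized lists of rows. The summary does not say how PSNR was measured in the project (for example, luminance only, or restricted to pixels inside the object), so the function name `psnr` and these choices are illustrative only.

```python
import math


def psnr(original, reconstructed, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Returns math.inf when the frames are identical (zero error)."""
    squared_error = 0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            squared_error += (a - b) ** 2
            count += 1
    if squared_error == 0:
        return math.inf
    mse = squared_error / count
    return 10 * math.log10(peak * peak / mse)
```

Comparing two schemes "for the same PSNR" then means adjusting each until this value matches on the reconstructed frames and comparing the number of bits each one spends.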
*Acknowledgements:* The investigators are pleased to acknowledge the support for this work from the UK Engineering and Physical Sciences Research Council.