Professor Hans Knutsson
Medical Informatics Group,
Department of Biomedical Engineering,
Linköping University, Sweden.
Manifolds and Images: Signal processing goes round the bend
Signal and image processing aims at solving real world practical
problems dealing with uncertain noisy signals generated by physical
sensors. Traditional signal processing is based on a well-established
theory developed for scalar signals in a statistical framework. A
substantial part of the theory involves convolution operations.
Smoothing and edge detection are simple examples of standard
procedures in traditional image processing. Over the years there has
been a need to generalize these methods, originally developed for
monochrome 2-D images, to multi-dimensional scalar, vector and tensor
Today some researchers are beginning to realize that the developed
theoretical framework is still too narrow to handle a number of
important problems. The fundamental limitation is that the theory is
based on a Cartesian assumption that may not be valid.
There are two typical situations where a Cartesian assumption can lead
to severe errors: 1. The data sampling grid is non-Cartesian in
space-time, e.g. ultrasound images or samples taken from curved
surfaces. 2. The appropriate space for representing the sample values
is a curved subspace (i.e. not a vector space) embedded in the measured
parameter space, e.g. cyclic features, like hue and phase, or
The natural mathematical tools here are differential geometry, tensor
analysis and the theory of smooth Riemannian manifolds. A manifold is
a generalization of a surface in R3 where the properties can be
defined without reference to an ambient space. Recent work reflects a
growing interest in this topic, but a unified framework is still
missing. We present a number of examples from our work, ranging from
non-rigid registration (the Morphon) to manifold learning (the Sample
LogMap), indicating the value of establishing the necessary steps for
adapting traditional multi-dimensional signal processing methods to
comprise a general class of signals on manifolds, i.e. why signal
processing needs to "go round the bend".
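The second situation described above, cyclic features such as hue and phase, can be illustrated with a small sketch. It is not taken from the talk: the function names and the sample hue values are assumptions chosen for illustration. It compares the ordinary arithmetic mean, which treats angles as points on a line, with a mean computed on the circle by averaging unit vectors.

```python
import math


def arithmetic_mean(angles_deg):
    """Plain average that treats angles as ordinary real numbers."""
    return sum(angles_deg) / len(angles_deg)


def circular_mean(angles_deg):
    """Mean direction on the circle, returned in degrees.

    Each angle is mapped to a unit vector, the vectors are summed,
    and the direction of the sum is read back with atan2.
    """
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0


# Hypothetical hue samples on either side of the 0/360 wrap-around.
hues = [350.0, 30.0]

print("arithmetic mean:", arithmetic_mean(hues))  # lands on the opposite side of the circle
print("circular mean:  ", circular_mean(hues))    # stays between the two samples
```

For these two samples the arithmetic mean is 190 degrees, which is far from both hues, while the circular mean is approximately 10 degrees. The circular mean is only one simple way to respect the curved sample space. It is undefined when the unit vectors cancel out, for example for two exactly opposite angles.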
Professor Mubarak Shah
Agere Chair Professor, Computer Vision Lab., School of Electrical Engineering and Computer Science, University of Central Florida, USA.
Recognizing Actions, Objects, and Actions as Objects
Recognition of human actions from video sequences is a very popular problem in computer vision. Since an action takes place in 3D and is projected onto a sequence of 2D images, the projected 2D motion may vary depending on the viewpoint of the camera. This creates a problem in recognizing human actions from 2D video sequences. In most current work on action recognition, the issue of view invariance has been ignored. In this talk, I will present our work on human action recognition, which uses geometry to deal with the problem of view invariance.
Object recognition is a classic problem in computer vision, which has been popular in the community for the last thirty years. In the second part of my talk, I will present a novel multi-view generic object class recognition method based on 3D object modeling. Instead of using a complicated mechanism for relating multiple 2D training views, the proposed method establishes spatial connections between these views by attaching appearance features to the surfaces of 3D models. The 3D model is represented by a volume consisting of binary slices, and is generated by using a new homographic framework.
When an actor performs an action in 3D, the points on the outer boundary of the actor are projected as a 2D (x, y) contour in the image plane. A sequence of such 2D contours with respect to time generates a spatiotemporal volume (STV) in (x, y, t), which can be treated as a 3D object in the (x, y, t) space. In the third part of this talk, I will present our approach for human action recognition by treating actions as objects. We analyze the STV using differential geometric surface properties, such as peaks, pits, valleys and ridges, which are important action descriptors capturing both spatial and temporal properties. A set of motion descriptors for a given action is called an action sketch. The action descriptors are related to various types of motions and object deformations.
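As a rough illustration of how a spatiotemporal volume can be assembled, the sketch below stacks per-frame binary masks along a time axis using NumPy. It is not the speaker's implementation. The synthetic moving disk, the function names `disk_mask` and `build_stv`, and the use of filled silhouettes (whose boundary would be the contour) instead of extracted contours are all assumptions made for the example.

```python
import numpy as np


def disk_mask(height, width, cx, cy, radius):
    """Binary silhouette of a filled disk, standing in for an actor's outline."""
    yy, xx = np.mgrid[0:height, 0:width]
    return (xx - cx) ** 2 + (yy - cy) ** 2 <= radius ** 2


def build_stv(masks):
    """Stack per-frame 2D masks into one volume indexed as (y, x, t)."""
    return np.stack(masks, axis=-1)


# Hypothetical sequence: a disk moving to the right by 2 pixels per frame.
frames = [disk_mask(32, 32, cx=8 + 2 * t, cy=16, radius=5) for t in range(8)]
stv = build_stv(frames)

print(stv.shape)  # (rows, columns, frames)
```

In this toy volume the moving disk traces a slanted tube in (x, y, t). The surface of such a volume is the kind of object on which differential geometric properties like ridges and valleys could then be computed.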