Charles Duncan Malleson

Research Fellow B, School of Computer Science & Electronic Engineering, Faculty of Engineering and Physical Sciences, University of Surrey

Conference presentation Open access Peer reviewed

Joint Demosaicing and Chromatic Aberration Correction of Images Using Neural Networks.

by Charles Malleson and Adrian Hilton

Availability date 24/01/2020

CVMP 2019

Conference on Visual Media Production 2019

Typical colour digital cameras have a single sensor with a colour filter array (CFA), each pixel capturing a single channel (red, green or blue). A full RGB colour output image is generated by demosaicing (DM), i.e. interpolating to infer the two unobserved channels for each pixel. The DM approach used can have a significant effect on the quality of the output image, particularly in the presence of common imaging artifacts such as chromatic aberration (CA). Small differences in the focal length for each channel (lateral CA) and the inability of the lens to bring all three channels simultaneously into focus (longitudinal CA) can cause objectionable colour fringing artifacts in edge regions. These artifacts can be particularly severe when using low-cost lenses. We propose to use a set of simple neural networks to learn to jointly perform DM and CA correction, producing high quality colour images from input subject to severe CA as well as image noise. The proposed neural network-based joint DM and CA correction produces a significant improvement in image quality metrics (PSNR and SSIM) compared to the baseline edge-directed linear interpolation approach, preserving image detail and reducing objectionable false colour and comb artifacts. The approach can be applied in the production of high quality images and video from machine vision cameras with low-cost lenses, thus extending the viability of such hardware to visual media production.
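To make the demosaicing step concrete, the sketch below simulates a CFA capture and fills in the two unobserved channels per pixel by averaging neighbouring samples. It is only an illustration of plain bilinear-style interpolation: the RGGB layout, the function names and the nested-list image format are assumptions made here, and this is neither the paper's neural network nor its edge-directed baseline.

```python
def bayer_channel(y, x):
    """Channel (0=R, 1=G, 2=B) sampled at pixel (y, x), assuming an RGGB layout."""
    if y % 2 == 0:
        return 0 if x % 2 == 0 else 1
    return 1 if x % 2 == 0 else 2


def mosaic(rgb):
    """Simulate a CFA capture: keep one channel per pixel of an H x W x 3 nested list."""
    return [[px[bayer_channel(y, x)] for x, px in enumerate(row)]
            for y, row in enumerate(rgb)]


def demosaic_bilinear(raw):
    """Infer the unobserved channels by averaging same-channel samples
    in the (border-clipped) 3x3 neighbourhood. Requires an image of at least 2x2."""
    h, w = len(raw), len(raw[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            pixel = []
            for c in range(3):
                if bayer_channel(y, x) == c:
                    # This channel was observed directly at this pixel.
                    pixel.append(float(raw[y][x]))
                    continue
                samples = [raw[j][i]
                           for j in range(max(0, y - 1), min(h, y + 2))
                           for i in range(max(0, x - 1), min(w, x + 2))
                           if bayer_channel(j, i) == c]
                pixel.append(sum(samples) / len(samples))
            row.append(pixel)
        out.append(row)
    return out


# Usage: a flat-coloured 4x4 image survives the mosaic/demosaic round trip.
flat = [[(10, 20, 30) for _ in range(4)] for _ in range(4)]
raw = mosaic(flat)
restored = demosaic_bilinear(raw)
```

Interpolating each channel independently like this ignores edges and inter-channel misalignment, which is the kind of situation where false colour and fringing become visible.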

Conference presentation Open access Peer reviewed

Real-time Full-Body Motion Capture from Video and IMUs

by Charles Malleson, Marco Volino, Andrew Gilbert, Matthew Trumble, John Collomosse and Adrian Hilton

Published 12/10/2017

3DV 2017 Proceedings

International Conference on 3D Vision (3DV) 2017, Shandong University, Qingdao, China

A real-time full-body motion capture system is presented which uses input from a sparse set of inertial measurement units (IMUs) along with images from two or more standard video cameras and requires no optical markers or specialized infra-red cameras. A real-time optimization-based framework is proposed which incorporates constraints from the IMUs, cameras and a prior pose model. The combination of video and IMU data allows the full 6-DOF motion to be recovered including axial rotation of limbs and drift-free global position. The approach was tested using both indoor and outdoor captured data. The results demonstrate the effectiveness of the approach for tracking a wide range of human motion in real time in unconstrained indoor/outdoor scenes.

Conference presentation Open access Peer reviewed

Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors

by Matthew Trumble, Andrew Gilbert, Charles Malleson, Adrian Hilton and John Collomosse

Published 07/09/2017

Proceedings of 28th British Machine Vision Conference, 1 - 13

28th British Machine Vision Conference, 04/09/2017–07/09/2017, London, UK

We present an algorithm for fusing multi-viewpoint video (MVV) with inertial measurement unit (IMU) sensor data to accurately estimate 3D human pose. A 3-D convolutional neural network is used to learn a pose embedding from volumetric probabilistic visual hull data (PVH) derived from the MVV frames. We incorporate this model within a dual stream network integrating pose embeddings derived from MVV and a forward kinematic solve of the IMU data. A temporal model (LSTM) is incorporated within both streams prior to their fusion. Hybrid pose inference using these two complementary data sources is shown to resolve ambiguities within each sensor modality, yielding improved accuracy over prior methods. A further contribution of this work is a new hybrid MVV dataset (TotalCapture) comprising video, IMU and a skeletal joint ground truth derived from a commercial motion capture system. The dataset is available online at http://cvssp.org/data/totalcapture/.

Conference presentation Open access

FaceDirector: Continuous Control of Facial Performance in Video

by Charles Malleson, Jean-Charles Bazin, Oliver Wang, Derek Bradley, Thabo Beeler, Adrian Hilton and Alexander Sorkine-Hornung

Published 05/2016

2015 IEEE International Conference on Computer Vision (ICCV), 3979 - 3987

2015 IEEE International Conference on Computer Vision (ICCV 2015), 13/12/2015–16/12/2015, Santiago, Chile

We present a method to continuously blend between multiple facial performances of an actor, which can contain different facial expressions or emotional states. As an example, given sad and angry video takes of a scene, our method empowers the movie director to specify arbitrary weighted combinations and smooth transitions between the two takes in post-production. Our contributions include (1) a robust nonlinear audio-visual synchronization technique that exploits complementary properties of audio and visual cues to automatically determine robust, dense spatiotemporal correspondences between takes, and (2) a seamless facial blending approach that provides the director full control to interpolate timing, facial expression, and local appearance, in order to generate novel performances after filming. In contrast to most previous works, our approach operates entirely in image space, avoiding the need for 3D facial reconstruction. We demonstrate that our method can synthesize visually believable performances with applications in emotion transition, performance correction, and timing control.

Conference presentation Open access Peer reviewed

Single-view RGBD-based reconstruction of dynamic human geometry

by C Malleson, M Klaudiny, A Hilton and J-Y Guillemaut

Published 2013

Proceedings of the IEEE International Conference on Computer Vision - Workshop on Dynamic Shape Capture and Analysis (4DMOD 2013), 307 - 314

IEEE International Conference on Computer Vision Workshops (ICCVW 2013), 02/12/2013–08/12/2013, Sydney, NSW

We present a method for reconstructing the geometry and appearance of indoor scenes containing dynamic human subjects using a single (optionally moving) RGBD sensor. We introduce a framework for building a representation of the articulated scene geometry as a set of piecewise rigid parts which are tracked and accumulated over time using moving voxel grids containing a signed distance representation. Data association of noisy depth measurements with body parts is achieved by online training of a prior shape model for the specific subject. A novel frame-to-frame model registration is introduced which combines iterative closest-point with additional correspondences from optical flow and prior pose constraints from noisy skeletal tracking data. We quantitatively evaluate the reconstruction and tracking performance of the approach using a synthetic animated scene. We demonstrate that the approach is capable of reconstructing mid-resolution surface models of people from low-resolution noisy data acquired from a consumer RGBD camera. © 2013 IEEE.
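As a rough illustration of accumulating noisy depth measurements in a signed distance voxel grid, the sketch below applies the weighted running average commonly used for truncated signed distance fusion. Whether the paper uses exactly this update is an assumption; the function names, truncation distance and weight cap are hypothetical values chosen for the example.

```python
def truncate(sdf, trunc):
    """Clamp a signed distance to the band [-trunc, trunc]."""
    return max(-trunc, min(trunc, sdf))


def fuse(voxel, measurement, trunc=0.05, weight=1.0, max_weight=100.0):
    """Fold one signed distance measurement into a voxel.

    voxel is a (distance, weight) pair; the result is the weighted running
    average of all measurements so far, with the weight capped so that the
    grid can still adapt to a moving surface.
    """
    d, w = voxel
    m = truncate(measurement, trunc)
    new_w = w + weight
    new_d = (d * w + m * weight) / new_w
    return new_d, min(new_w, max_weight)


# Usage: two noisy observations of the same voxel average out.
v = (0.0, 0.0)          # empty voxel: no distance information yet
v = fuse(v, 0.02)
v = fuse(v, 0.04)
```

Averaging over many frames is what allows a surface smoother than any single depth frame to emerge from low-resolution, noisy input.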
