University of Surrey


Multi-modal Visual Data Registration for Web-based Visualisation in Media Production

Kim, Hansung, Evans, A, Blat, J and Hilton, Adrian (2016) Multi-modal Visual Data Registration for Web-based Visualisation in Media Production. IEEE Transactions on Circuits and Systems for Video Technology, 28 (4). pp. 863-877.

FINAL VERSION.pdf - Accepted version Manuscript
Available under License : See the attached licence file.



Recent developments in video and sensing technology have led to large amounts of digital media data. Current media production relies on video from the principal camera together with a wide variety of heterogeneous sources of supporting data (photos, LiDAR point clouds, witness video cameras, HDRI and depth imagery). Registration of visual data acquired from various 2D and 3D sensing modalities is challenging because existing matching and registration methods are not appropriate for the differing formats and noise characteristics of multi-modal data. A combined 2D/3D visualisation of the registered data allows an integrated overview of the entire dataset, and a web-based context presents several advantages for such a visualisation. In this paper we propose a unified framework for the registration and visualisation of this type of visual media data. A new feature description and matching method is introduced that adaptively combines local geometry, semi-global geometry and colour information in the scene for more robust registration. The resulting registered 2D/3D multi-modal visual data is too large to be downloaded and viewed directly in a web browser while maintaining an acceptable user experience. We therefore employ hierarchical techniques for compression and restructuring to enable efficient transmission and visualisation over the web, leading to interactive visualisation of registered point clouds, 2D images and videos in the browser, and improving on state-of-the-art techniques for web-based visualisation of big media data. This is the first unified 3D web-based visualisation of multi-modal visual media production datasets. The proposed pipeline is tested on big multi-modal datasets typical of film and broadcast production, which are made publicly available. The proposed feature description method achieves twice the feature-matching precision and more stable registration performance than existing 3D feature descriptors.

Item Type: Article
Subjects : Electronic Engineering
Divisions : Faculty of Engineering and Physical Sciences > Electronic Engineering
Authors :
Kim, Hansung
Evans, A
Blat, J
Hilton, Adrian
Date : 21 December 2016
DOI : 10.1109/TCSVT.2016.2642825
Copyright Disclaimer : Copyright 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Uncontrolled Keywords : Multi-modal visual data processing, 2D-3D registration, 3D feature descriptors, 3D feature matching, Progressive rendering, WebGL visualisation.
Related URLs :
Depositing User : Symplectic Elements
Date Deposited : 16 Dec 2016 13:57
Last Modified : 21 Jun 2018 10:35





© The University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom.
+44 (0)1483 300800