Downloadable Documents and Publications

Sections:
- Public documents and presentations
- Public summaries from internal reports
- Public reports
- Academic publications

Depth-enabled test data repository
As an initiative of the 3D Immersive Interactive Media (3DIIM) Cluster (www.3dmedia-cluster.eu), a repository of depth-enabled test data sequences has been set up.
Some depth-enabled sequences are available here for research and development purposes. You may use these sequences provided you cite the source according to the credits in each file.
Go to the repository

Useful links:
- EU funded projects
- Professional associations
- Other interesting resources

Public documents and presentations

DATE  DESCRIPTION
June 26, 2008   Title: "Open FX"
Download OpenFX document in PDF   Developing new tools for stereo imaging postproduction. Common development architecture description (OpenFX)
By Miriam Balaguer, Barcelona Media.
June 3, 2008   Title: "Beyond Stereoscopic 3D"
  Technology presentation by Ralf Tanger of HHI during the Dimension3 expo in Chalon-sur-Saone (France)
http://www.dimension3-expo.com
July 1, 2008   Title: "Real-Time 3D-Audio for Digital Cinema"
  Presented at Acoustics 08, held in Paris in July 2008
By Pau Arumí, David García, Toni Mateos, Adan Garriga, Jaume Durany, Barcelona Media
May 1, 2009   May 2009 Newsletter
  Project newsletter. Includes: Immersive revolution (p. 1); Head swap, an immersive experience (p. 2); Emerging standards for 3D media (p. 3); The 2020 3D Media Consortium (p. 4)
May 1, 2010   May 2010 Newsletter
  Project newsletter. Includes: Depth enhanced inpainting (p. 1); Dialogue with Bernard Mendiburu (p. 2); Cinematographic editing and the passive viewer (p. 3); Spatial audio (p. 4)
August 1, 2010   System architecture diagram and highlights
  This poster shows the proposed system architecture, the demonstration scenarios and the development highlights.
April 1, 2011   May 2011 Newsletter
  Project newsletter. Includes: New notions of spectatorship, presence and empathy (p. 1); Surround video polycamera system (p. 2); Workflow, Metadata, and Semantic Definitions (p. 3); Multi-label depth estimation for graph cuts stereo problems (p. 4)
November 17, 2011   Grass Valley - CVMP 2011 Presentation
  Dr. Peter Centen, Group Leader and Chief Scientist of Imaging at Grass Valley, talked about the research on Time-of-Flight (ToF) cameras carried out in the project.
    More publishable documents (video, pictures) available upon request: info@20203dmedia.eu

[End of public documents and presentations]

Public summaries from internal reports

DATE  DESCRIPTION
May 30, 2008   Title: "Project Hand Book and Quality Plan"
  This document is the Project Handbook for the 2020 3D Media Project.
By Eugenia Fuenmayor and Elena Torres, Barcelona Media.
May 30, 2008   Title: "2020 3D Media Initial IP Strategy Document"
  This document describes the way Intellectual Property is handled inside the 2020 3D Media consortium.
By Eugenia Fuenmayor and Elena Torres, Barcelona Media.
May 30, 2008   Title: "Dissemination Strategy"
  Initial version of the plan for disseminating the activities and the generated knowledge and products of the project.
By Santi Fort, Barcelona Media.
August 22, 2008   Title: "Self-assessment Plan"
  Quality Management is a Consortium Management task, designed to assure that the project keeps on track and meets the goals.
By Tom Evans and Peter Stansfield, coordinated by Barcelona Media.
August 28, 2008   Title: "Definition of a new camera architecture"
  This document describes the changes needed in the architecture of a regular professional 2D camera to turn it into a camera that can capture scenes in 2D + depth.
By Klaas Jan Damstra, Grass Valley Netherland
August 31, 2008   Title: "Requirements and definition of new architecture for combining Viper camera with Trifocal Stereo and Structured Light Devices"
  3D Image Capture combining tri-focal stereovision, spatio-temporal structured light and a cinematographic camera with Z-channel into a unified video-and-depth recording device.
By Ralf Tanger, Fraunhofer HHI and others
October 21, 2008   Title: "Report on magnification errors in the optical path and possible algorithms to create higher resolution out of 2K channels"
  Market trends and state-of-the-art in technology with respect to resolution, frame rates and bit depth concerning high-end cameras to be designed
By Klaas Jan Damstra, Grass Valley Netherland
February 25, 2009   Title: "Support for multi-view content in metadata standards and formats"
  This document collects the metadata elements required for the production of multi-source digital cinema content.
By Werner Bailer, JRS and others.
February 13, 2009   Title: "A new in-field recording architecture"
  Specification of the in-field recorder for the needs of this project and for a generally proposed future 3D acquisition approach.
By Thomas Brune, DTO
February 11, 2009   Title: "Algorithm to create a higher resolution demonstrated on PC"
  Results of the implementation of the demonstrator of the offset processing
By Klaas Jan Damstra, Grass Valley Netherland
February 28, 2009   Title: "Improved algorithms for 2D-3D image conversion"
  2D-3D conversion consists of turning any 2D content into 3D.
By Philippe Gérard, Doremi.
February 25, 2009   Title: "Image inpainting methods and multi-view image processing"
  Proposal of an image inpainting method adapted to artefact correction in the context of view synthesis.
By Vicent Caselles, Barcelona Media.
February 26, 2009   Title: "Specification of design and implementation of automated distribution system."
  This document identifies the main requirements and specifies the software functionality for automated identification and distribution of content.
By Dave Stringer, Datasat.
February 28, 2009   Title: "Requirements and Standards for Spatialised Audio play-out"
  This document presents the proposal for a spatial audio format that is suitable for distribution and independent of the exhibition system.
By Toni Mateos, Barcelona Media.
June 6, 2009   Title: "Market Analysis, Business Models and Exploitation Planning"
  This document gives an overview of the markets and technology involved in 3D stereoscopic content creation and distribution, and of the business models that support the exhibition of such content.
By Jordi Alonso, Mediapro.
May 29, 2009   Title: "New camera architecture design"
  This report describes what has been achieved so far with respect to the design and (partial) use of the new camera architecture in the 2020 3D Media project.
October 15, 2009   Title: "Workflow Design and Experimental Testing"
  Considerations on workflow design, drawn from the experience of producing 3D content using multiple different cameras and tools.
July 31, 2009   Title: "Content analysis tools for multi-source content"
  This document describes the prototype of the multi-view content analysis framework.
August 25, 2009   Title: "Prototype recording solution prepared for enhanced resolution application"
  This paper describes a first version prototype of a new portable field recorder solution for 3D image capture.
August 26, 2009   Title: "Initial Prototype 2K Viper Camera with Satellite Cameras"
  Hardware and software components of a trifocal depth capture prototype are described.
September 02, 2009   Title: "Initial prototype 2K camera with structured light beamer"
  This document describes our initial prototype of a 2K resolution, real-time capable, depth capture method for cinematographic or broadcast use.
September 08, 2009   Title: "Immersive/augmented/3D display system v1"
  Description of the architecture of the initial prototype for displaying immersive and 3D images.
November 27, 2009   Title: "Prototype (2K) Viper camera with combined satellite/beamer"
  Components of a depth capture prototype combining a structured light and a trifocal approach are described.
February 28, 2010   Title: "Semantically Driven Workflow System"
  This report presents the Semantically Driven Workflow System prototype as derived from the requirements collected, exposed and analysed.
February 26, 2010   Title: "Extensions of MPEG-C Part3, MVC and MXF for advanced depth and 3D audio representation"
  This document summarizes the metadata specification throughout the workflows covered by 2020 3D Media.
February 26, 2010   Title: "Higher resolution algorithm demonstrated on new camera architecture"
  A short description and an MPEG video demo of the enhanced resolution demo package.
February 26, 2010   Title: "Plug-ins to import depth-enhanced image and auxiliary data"
  This deliverable describes an OpenFX plug-in for post-production tools such as Nuke, Mistika and others.
February 18, 2010   Title: "Object segmentation and re-detection algorithm for multi-view content v1"
  This deliverable describes the prototypes for fulfilling the pre-processing tasks for object re-detection.
February 26, 2010   Title: "Deconvolution method for audio in real-time"
  The research described here focuses on two blind algorithms for sound deconvolution.
February 26, 2010   Title: "Automated distribution systems feeding Cinema and/or public spaces"
  The deliverable presents the initial prototype of a digital distribution system which operates on metadata and orders to transfer the required files to their destinations in an efficient and timely manner.
March 08, 2010   Title: "Evaluation procedures"
  This deliverable presents the evaluation methodologies to be followed in order to gather feedback from professional and end users.

[End of public summaries from internal reports]

Public reports

DATE  DESCRIPTION
June 12, 2009   Title: "Protocols and standards for data capture"
  Outline of a target system in which all three 3D techniques (tri-focal, spatio-temporal structured light and Z-channel acquisition) are unified
By Thomas Brune, DTO
December 15, 2008   Title: "Initial set of test material of 2D-3D image conversion, set of geometric test patterns for testing the correctness of 2D-3D conversion"
  Technology behind the 2D-3D conversion process, which starts with an automatic or semi-automatic segmentation.
By Philippe Gérard, Doremi
June 1, 2009   Title: "Emerging Standards for 3D Media"
  This report summarizes the current activities in 3D video standardisation.
By Ralf Tanger, Fraunhofer HHI
February 27, 2008   Title: "Spectator response factors, production, workflow and technology requirements for stereoscopic and immersive video"
  Foundation of research into Immersive and Stereoscopic Cinematography, gathering data about and building an understanding of the audience’s engagement and subjective perception.
By Eric Martrou, Mediapro and Kurt Vanhoutte, CREW
March 13, 2009   Title: "Surround video polycamera system"
  Reports on the first-year development of an omnidirectional camera system in the 2020 3D Media project.
By Philippe Bekaert, Hasselt University
March 13, 2009   Title: "Demonstrator of framework for the capture, manipulation, and display of omnidirectional video"
  Describes the “janus” software framework for the capture, manipulation and display of omnidirectional video.
By Philippe Bekaert, Hasselt University
August 09, 2009   Title: "Workflow, Metadata, and Semantic Definitions Report"
  The document describes the overall process workflow, related actions, actors and semantic relations.
By Atta Badii, Meng Zhu, David Fuschi, Werner Bailer, Georgia Efthymiopoulou
December 18, 2009   Title: "Demonstrator of 2D-3D image conversion material"
  This deliverable presents the improvement obtained in 2D-3D conversion for more complex images.
By Philippe Gérard
September 30, 2009   Title: "Project Progress Presentation"
  This document describes what has been done for the dissemination of the project during IBC 2009.
By Santi Fort
February 26, 2010   Title: "Cinematography of surround video, assuming a passive spectator"
  Direct translation of classical cinematographic parameters into immersive media proves to be problematic.
By Kurt Vanhoutte, Eric Joris, Brecht Debackere, Nele Wynants
February 23, 2010   Title: "Metadata Representation for 3D Cinema Production"
  This document summarizes the metadata specification throughout the workflows covered by 2020 3D Media, their mapping to formats and embedding into containers, where applicable.
By Werner Bailer (JRS), Philippe Bekaert (UoH), Klaas Jan Damstra (GVN), Jan van Rooy (GVN), Stephen Field (DTS), Dave Stringer (Datasat), David Fuschi (UoR)
February 26, 2010   Title: "Depth enhanced inpainting model and automatic detection of occluded/disoccluded regions"
  This deliverable addresses the problem of inpainting depth-enhanced images.
By Pau Gargallo, Alexandre Hervieu, and Vicent Caselles
February 26, 2010   Title: "Spatial Audio Playback for Realistic Cinema Environments"
  This report describes tests, carried out in a real environment at the DTS theatre near London, of the technology produced by BM-audio for encoding and decoding 3D audio soundtracks.
By Pau Arumí, Giulio Cengarle, Toni Mateos

[End of public reports]

Academic publications

DATE  DESCRIPTION
October 16, 2008   Title: "Being inside the image. Heightening the sense of presence in a video captured environment through artistic means: the case of CREW."
Paper to download   This paper explores the use of omni-directional video, a high impact immersive medium, by a performance group and multi-disciplinary team of artists and scientists.
By Nele Wynants, Kurt Vanhoutte, Theatre, Film and Literature Studies, University of Antwerpen and Philippe Bekaert, Expertise Centre for Digital Media, University of Hasselt
September 12, 2008   Title: "Cognitive immersion"
Paper to download   This paper presents our research on the cognitive processes involved in 3D immersion, and our plans for investigating how this type of environment could help people with special needs, in particular people with cerebral palsy.
By Elena Parra and Raquel Navarro, BM
June 22, 2008   Title: "Video Enhancement Using Reference Photographs"
Paper to download   Handheld digital video cameras have become increasingly popular and cheaper in recent years. Even still cameras offer additional functionality for shooting videos. Unfortunately, the small form factor of these devices limits the light sensitivity, and often the lens and sensor do not allow for satisfactory image quality. We wish to process a video into a more aesthetically pleasing version, by borrowing information from high quality reference photographs of the same scene. Since the process of taking a photograph is not time-critical, we can afford a longer exposure for reducing noise, and record more information to increase resolution.
By Cosmin Ancuti, Tom Haber, Tom Mertens, Philippe Bekaert, Hasselt University
June 22, 2008   Title: "Eunomia: Toward a Framework for Multi-touch Information Displays in Public Spaces"
Paper to download   Multi-touch interaction techniques are becoming more widespread because of new industrial initiatives to make this hardware available and affordable for the consumer market. To cope with the diversity in hardware setups and the lack of knowledge about developing generic multitouch applications, we created a framework, Eunomia, for abstracting the hardware from the software and to enable software developers to easily develop interactive applications taking advantage of multi-touch interaction.
By Tom Cuypers, Jan Schneider, Johannes Taelman, Kris Luyten, Philippe Bekaert, Hasselt University
June 26, 2008   Title: "IT leverage for media acquisition: New paradigms in the key area of digital cinematography and HD production workflows"
Paper to download   This paper proposes the demonstration of a comprehensive novel acquisition infrastructure based on standard 10G Ethernet interface technology. This infrastructure is suitable for digital cinematography as well as HD production workflows and comprises camera, mobile solid state recorder and monitor all connected through a 10G Ethernet based network. Optical Ethernet technology enables bidirectional, high speed, long distance networking at affordable cost as conventional IT equipment - such as switches - are fully supported. Interface and file formats enable versatile metadata support and provide a back channel to the camera for control purposes. Advances in solid state storage capacity in combination with interface performance eliminate any need for costly and quality compromising compression.
By T. Brune, A. Kochale and J.P. Wittenburg, Technicolor
January 1, 2009   Title: "Time-triggered Static Schedulable Dataflows for Multimedia Systems"
Paper to download   Software-based reactive multimedia computation systems are pervasive today in desktops but also in mobile and ultra-portable devices. Most such systems offer a callback-based architecture to incorporate specific stream processing. Article presented at the Electronic Imaging - Multimedia Networking and Computing conference in January 2009 in San Jose, California.
By Pau Arumí, BM and Xavier Amatriain, Telefonica Research
May 1, 2009   Title: "3D-Audio with CLAM and Blender's Game Engine"
Paper to download   Blender can be used as a 3D scene modeler, editor, animator, renderer and Game Engine. This paper describes how it can be linked to a 3D sound platform working within the CLAM framework, with special emphasis on a specific application: the recently launched Yo Frankie! open content game for the Blender Game Engine. The game was hacked to interact with CLAM, implementing spatial scene descriptors transmitted over the Open Sound Control protocol, and allowing experimentation with many different spatialization and acoustic simulation algorithms. Further new applications of this Blender-CLAM integration are also discussed.
By Natanael Olaiz, Pau Arumí, Toni Mateos and David Garcia, BM
March, 2009   Title: "A Framework for Networked Interactive Surfaces"
Paper to download   The development of interactive surfaces has led to a number of applications as it allows for natural interaction and collaboration. The use of co-located collaboration on this kind of surface provides new possibilities. It is our belief, however, that an even higher degree of collaboration can be achieved by overcoming the boundaries of a single device or setup. Therefore, we extended our previously built multi-touch framework to realize collaborative multi-device setups. In order to assess the framework, a networked virtual world and three other applications were built to demonstrate its power.
By Tom Cuypers, Karel Frederix, Chris Raymaekers, Philippe Bekaert, Univ. of Hasselt.
March 23, 2009   Title: "Estimating 3D camera motion for rendering audio in virtual scenes"
Paper to download   In the production phase of digital cinema content it is important to allow the director not only to preview the final rendered scene early in the shooting process, but also to prehear the 5.1 surround and HRTF-based binaural versions, thus enabling visual and auditive artistic decisions to be taken at the shooting stage. In many cases camera tracking data is not available for all cameras on the set (e.g. handheld ones) and thus the motion of the camera needs to be estimated. In this paper we describe an approach for estimation of 3D camera motion and its use for real-time audio rendering.
By Werner Bailer, JRS, and Pau Arumí, Toni Mateos, Adan Garriga, Jaume Durany, David García, BM.
May, 2009   Title: "Summarizing Raw Video Material Using Hidden Markov Models"
Paper to download   Besides the reduction of redundancy, the selection of representative segments is a core problem when summarizing collections of raw video material. We propose a novel approach for the selection of segments to be included in a video summary based on Hidden Markov Models (HMM), which are trained on an annotated subset of the content. The observations of the HMM are relevance judgments of content segments based on different visual features, the hidden states are the selection/non-selection of content segments. The HMM is designed to take all relevant scenes into account. We show that the approach generalizes well when trained on sufficiently diverse content.
By Werner Bailer and Georg Thallinger, JRS
June 20, 2009   Title: "Depth from sliding projections"
Paper to download   In this paper we present a novel method for 3D structure acquisition, based on structured light. Unlike classical structured light methods, in which a static projector illuminates a scene with time-varying illumination patterns, our technique makes use of a moving projector emitting a static striped illumination pattern. This projector is translated at a constant velocity, in the direction of the projector's horizontal axis. Illuminating the object in this manner allows us to perform a per pixel analysis, in which we decompose the recorded illumination sequence into a corresponding set of frequency components. The dominant frequency in this set can be directly converted into a corresponding depth value. This per pixel analysis allows us to preserve sharp edges in the depth image. Unlike classical structured light methods, the quality of our results is not limited by projector or camera resolution, but is solely dependent on the temporal sampling density of the captured image sequence. Additional benefits include a significant robustness against common problems encountered with structured light methods, such as occlusions, specular reflections, subsurface scattering, interreflections, and to a certain extent projector defocus.
By Tom Cuypers, Karel Frederix, Chris Raymaekers, Philippe Bekaert, Univ. of Hasselt.
June 20, 2009   Title: "Capturing Multiple Illuminations using Time and Color Multiplexing"
Paper to download   Many vision and graphics problems such as relighting, structured light scanning and photometric stereo, need images of a scene under a number of different illumination conditions. It is typically assumed that the scene is static. To extend such methods to dynamic scenes, dense optical flow can be used to register adjacent frames. This registration becomes inaccurate if the frame rate is too low with respect to the degree of movement in the scenes. We present a general method that extends time multiplexing with color multiplexing in order to better handle dynamic scenes. Our method allows for packing more illumination information into a single frame, thereby reducing the number of required frames over which optical flow must be computed. Moreover, color-multiplexed frames lend themselves better to reliably computing optical flow. We show that our method produces better results compared to time multiplexing alone. We demonstrate its application to relighting, structured light scanning and photometric stereo in dynamic scenes.
By Bert De Decker, Jan Kautz, Tom Mertens and Philippe Bekaert, Univ. of Hasselt.
July, 2009   Title: "Methodology for studying the numerical speed of sound in finite differences"
Paper to download   The space and time discretization inherent to all FDTD schemes introduces non-physical dispersion errors, i.e. deviations of the speed of sound from the theoretical value predicted by the governing Euler differential equations. A general methodology for computing this dispersion error via straightforward numerical simulations of the FDTD schemes is presented. The method is shown to provide remarkable accuracies of the order of 1/1000 in a wide variety of two-dimensional finite difference schemes.
By C. Spa, UPF, T. Mateos, A. Garriga, BM
September, 2009   Title: "Shadow Multiplexing for Real-Time Silhouette Extraction"
Paper to download
By Tom Cuypers, Yannick Francken, Johannes Taelman, Philippe Bekaert, Univ. of Hasselt
September 4, 2009   Title: "Realtime KLT feature point tracking for High Definition video"
Paper to download   Automatic detection and tracking of feature points is an important part of many computer vision methods. A widely used method is the KLT tracker proposed by Kanade, Lucas and Tomasi. This paper reports work done on porting the KLT tracker to the GPU, using the CUDA technology by NVIDIA. The paper describing the CUDA port of the point tracker has been accepted as a full paper at the Computer Graphics, Computer Vision and Mathematics (GraVisMa) workshop.
By Hannes Fassold, Jakub Rosner, Peter Schallauer and Werner Bailer, JRS
December, 2009   Title: "Relighting Objects from Image Collections"
Paper to download   We present an approach for recovering the reflectance of a static scene with known geometry from a collection of images taken under distant, unknown illumination. In contrast to previous work, we allow the illumination to vary between the images, which greatly increases the applicability of the approach. Using an all-frequency relighting framework based on wavelets, we are able to simultaneously estimate the per-image incident illumination and the per-surface-point reflectance. The wavelet framework allows for incorporating various reflection models. We demonstrate the quality of our results for synthetic test cases as well as for several datasets captured under laboratory conditions. Combined with multi-view stereo reconstruction, we are even able to recover the geometry and reflectance of a scene solely using images collected from the Internet.
By Tom Haber, Christian Fuchs, Philippe Bekaert, H.P. Seidel. Univ. of Hasselt
December, 2009   Title: "Gloss and Normal Map Acquisition of Mesostructures Using Gray Codes"
Paper to download   We propose a technique for gloss and normal map acquisition of fine-scale specular surface details, or mesostructure. Our main goal is to provide an efficient, easily applicable, but sufficiently accurate method to acquire mesostructures. We therefore introduce a setup consisting of cheap and accessible components, including a regular monitor and a digital still camera. We build on our previously proposed method [Francken et al. 2008], which acquires normal maps by analyzing the reflection of binary patterns (Gray codes). These patterns are successively refined in this process. The key idea in this paper is that this refinement also allows us to measure the shininess for each spatial location, resulting in a gloss map. Acquiring spatially varying reflectance usually requires a complicated hardware setup, which measures the BRDF at each spatial location. Our method is much simpler and cheaper. Even though we assume a very simplified BRDF model, our technique is able to reproduce the mesostructure's appearance faithfully.
By Yannick Francken, Tom Cuypers, Tom Mertens, Philippe Bekaert. Univ. of Hasselt
December 9, 2009   Title: "Metadata for Creation and Distribution of Multi-view Video"
Paper to download   This paper reviews the metadata requirements for multi-view video content and analyzes how well these requirements are covered in existing metadata standards, both in terms of the coverage of metadata elements and the capabilities to structurally describe multi-view video content. Presented at the Workshop on Semantic Multimedia Database Technologies (SeMuDaTe).
By Bailer W., Höffernig M., JRS
December 9, 2009   Title: "Formal Metadata Semantics for Interoperability in the Audiovisual Media Production Process"
Paper to download   Different metadata formats and standards are used in the stages of the process, containing metadata properties with similar and partly overlapping semantics, though not fully identical. We attempt to model the metadata properties used throughout the production process in a format independent way by creating an ontology that models these properties and the relations between them.
By Höffernig M., Bailer W., JRS
December 12, 2009   Title: "How to Align Media Metadata Schemas? Design and Implementation of the Media Ontology"
Paper to download   Multimedia data is generated, shared, stored and distributed worldwide at an ever increasing rate. This huge amount of content comes with metadata represented in different formats which hardly interoperate although they partially overlap.
By Florian Stegmaier, Werner Bailer, Tobias Bürger, et al., JRS
November 2, 2009   Title: "Fast GPU-based KLT feature point tracking using CUDA"
Paper to download   Automatic selection and tracking of feature points is a basic task for many computer vision algorithms (e.g. structure from motion, object tracking). The KLT algorithm [Kanade, Lucas and Tomasi] is very popular due to its good performance.
By Florian Stegmaier, Werner Bailer, Tobias Bürger, et al., JRS
November 2, 2009   Title: "Comparing Fact Finding Tasks and User Survey for Evaluating a Video Browsing Tool"
Paper to download   In the multimedia information retrieval community evaluations following the Cranfield paradigm (as e.g. used in TRECVID) have been widely adopted. We have applied two TRECVID style fact finding approaches (retrieval and question answering tasks) and a user survey to the evaluation of a video browsing tool.
By Werner Bailer and Herwig Rehatschek, JRS
December 17, 2009   Title: "A Video Browsing Tool for Content Management in Post-production"
Paper to download   We propose an interactive video browsing tool for supporting content management and selection in post-production. The approach is based on a process model for multimedia content abstraction. A software framework based on this process model and desktop and Web-based client applications are presented. Published in the International Journal of Digital Multimedia Broadcasting, 2010.
By Werner Bailer, Wolfgang Weiss, Gert Kienast, Georg Thallinger, and Werner Haas, JRS
July 14, 2009   Title: "Integrative Workflow-Embedded Support For Process Automation And Management"
Paper to download   This paper presents the on-going research performed in order to integrate process automation and process management support in the context of media production. This has been addressed on the basis of a holistic approach to software engineering applied to media production modeling to ensure design correctness, completeness and effectiveness. Presented at the European and Mediterranean Conference on Information Systems 2009.
By Badii A., Fuschi D., Zhu M., Univ. of Reading
November 23 , 2009   Title:"Confronto sperimentale tra microfono B-format Soundfield e sonda intensimetrica di pressione e velocità Microflown"
Paper to download   Il funzionamento dei microfoni B-Format si basa sull'elaborazione dei segnali ottenuti da una configurazione coincidente tetraedrica di capsule microfoniche direzionali, che vengono combinati linearmente per fornire in uscita quattro canali, indicati con W, X, Y e Z, proporzionali rispettivamente alla pressione sonora ed alle tre componenti cartesiane del vettore di velocità acustica. Questi segnali vengono equalizzati per compensare effetti di interferenza dovuti alla non perfetta coincidenza spaziale delle capsule. Le sonde intensimetriche di ultima generazione Microflown sono costituite da un microfono di pressione e tre trasduttori di velocità operanti in base al principio dell'anemometria a doppio filamento caldo, permettendo così una misura diretta e pressoché coincidente delle variabili acustiche e v. Per via del particolare principio di funzionamento i sensori di velocità sono caratterizzati da una curva di risposta in ampiezza e fase descrivibile con una serie di tre filtri passa-basso con diverse frequenze di taglio, che devono essere accuratamente compensati nella fase di acquisizione del segnale.
By G. Cengarle, T. Mateos and D. Bonsi, BM
April , 2010   Title: "A Narrow Band Method for the Convex Formulation of Discrete Multilabel Problems"
Paper to download   We study a narrow band type algorithm to solve a discrete formulation of the convex relaxation of energy functionals with total variation regularization and non-convex data terms. We prove that this algorithm converges to a local minimum of the original non-linear optimization problem. We illustrate the algorithm with experiments for disparity computation in stereo and a multi-label segmentation problem, and we check experimentally that the energy of the local minimum is very close to the energy of the global minimum obtained without the narrow band type method.
By A. Baeza, V. Caselles, P. Gargallo, N. Papadakis, BM
April 1 , 2010   Title: "Migrating from Process Automation to Process Management Support: a Holistic Approach to Software Engineering applied to Media Production"
Paper to download   This paper presents the on-going research performed in order to migrate from process automation to process management support in the context of media production and more specifically 3D cinematographic immersive and interactive production.
By Badii A., Fuschi D., Zhu M., Univ. of Reading
April, 2010   Title: "An improved method to determine the onset timings of reflections in an acoustic impulse response"
Paper to download   Determining the absolute onset time of reflections in an acoustic impulse response (IR) has applications for both subjective and physical acoustics problems. Although computationally simple, a first-order energetic analysis of the IR can lead to false-positive identification of reflections. This letter reports on a new method to determine reflection onset timings using a modified running local kurtosis analysis to identify regions in the IR where the distribution is non-normal. IRs from both real and virtual rooms are used to validate the new method and to find optimum analysis window sizes.
By J. Usher, BM
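To illustrate the idea of a running local kurtosis analysis mentioned in the abstract above, the following Python sketch computes the plain excess kurtosis over a sliding window. It is not the modified measure reported in the paper; the function name `running_kurtosis`, the window length and any test signal are assumptions made for this example.

```python
def running_kurtosis(signal, window):
    """Excess kurtosis of every length-`window` segment of `signal`.

    Hypothetical helper for illustration only: values near 0 suggest a
    roughly normal distribution, large positive values suggest an
    impulsive event such as a reflection inside the window.
    """
    result = []
    for start in range(len(signal) - window + 1):
        segment = signal[start:start + window]
        mean = sum(segment) / window
        # Second and fourth central moments of the segment
        m2 = sum((s - mean) ** 2 for s in segment) / window
        m4 = sum((s - mean) ** 4 for s in segment) / window
        # Guard against a constant segment (zero variance)
        result.append(m4 / (m2 * m2) - 3.0 if m2 > 0 else 0.0)
    return result
```

On a low-level signal containing one isolated spike, the windows that include the spike should show a much higher kurtosis than the others, which is the cue such an analysis would use to flag a candidate reflection onset.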
May, 2010   Title: "A new technology for the assisted mixing of sport events: application to live football broadcasting"
Paper to download   This paper presents a novel application for capturing the sound of the action during a football match by automatically mixing the signals of several microphones placed around the pitch and selecting only those microphones which are close to, or aiming at, the action. The sound engineer is presented with a user interface where he or she can define and move dynamically the point of interest on a screen representing the pitch, while the application controls the faders of the broadcast console. The technology has been applied in the context of a three-dimensional surround sound playback of a Spanish first-division match.
By G. Cengarle, T. Mateos, N. Olaiz, P. Arumi, BM
May, 2010   Title: "Measuring impulse responses using speech and music"
Paper to download   Continuous measurement of room impulse responses (RIRs) in the presence of an audience has many applications for room acoustics: in-situ loudspeaker/room equalization, teleconferencing, and architectural acoustic diagnostics. A continuous analysis of the RIR is often preferable to a single measurement, especially with non-stationary room characteristics such as those caused by changing atmospheric or audience conditions. This paper discusses the use of adaptive filters updated according to the NLMS algorithm for fast, continuous in-situ RIR acquisition, particularly when the input signal is music or speech. We show that the dual-channel FFT (DCFFT) method has slower convergence and is less robust to coloured signals such as music and speech. Data is presented comparing the NLMS and the DCFFT methods, and we show that the adaptive filter approach provides RIRs with high accuracy and high robustness to background noise using music or speech signals.
By J. Usher, BM
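As an illustration of the adaptive-filter approach summarized in the abstract above, here is a minimal Python sketch of the textbook normalized least mean squares (NLMS) update used to estimate an impulse response from an input signal and the signal recorded in the room. The function name `nlms_identify`, the step size `mu` and the regularization constant `eps` are assumptions for this example and are not taken from the paper.

```python
def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate FIR taps h so that d[n] is approximated by sum_k h[k] * x[n - k].

    x: input samples (e.g. the signal sent to the loudspeaker)
    d: observed samples (e.g. the microphone signal)
    mu, eps: assumed step size and regularization for this sketch
    """
    h = [0.0] * num_taps
    recent = [0.0] * num_taps  # most recent inputs, newest first
    for x_n, d_n in zip(x, d):
        recent = [x_n] + recent[:-1]
        y_n = sum(hk * xk for hk, xk in zip(h, recent))  # filter output
        e_n = d_n - y_n                                  # estimation error
        power = sum(xk * xk for xk in recent) + eps      # input energy
        step = mu * e_n / power                          # normalized step
        h = [hk + step * xk for hk, xk in zip(h, recent)]
    return h
```

Dividing the update by the input energy makes the step size independent of the signal level, which is one reason this family of filters is attractive for inputs whose level varies over time, such as speech and music.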
June, 2010   Title: "Multi-label depth estimation for graph cuts stereo problems"
Paper to download   We describe here a method to compute the depth of a scene from a set of at least two images taken at known view-points. Our approach is based on an energy formulation of the 3D reconstruction problem, which we minimize using a graph-cut approach that computes a local minimum whose energy is comparable with the energy of the absolute minimum. As usually done, we treat the input images symmetrically, match pixels using photo consistency, treat occlusion and visibility problems, and consider a spatial regularization term which preserves discontinuities. The details of the graph construction as well as the proof of the correctness of the method are given. Moreover, we introduce a multi-label refinement algorithm in order to increase the number of depth labels without significantly increasing the computational complexity. Finally, we compare our algorithm with the results available in the Middlebury database.
By N. Papadakis and V. Caselles, BM
June, 2010   Title: "Compensation of the afterglow phenomenon in 2-D discrete-time simulation"
Paper to download   Due to high computational costs, the physics governed by the wave equation in 3D is often modeled via discrete-time numerical simulations conducted in 2D scenarios. Results are normally generalized to 3D rather straightforwardly, overlooking the fact that the propagation of a point-like impulse in 2D exhibits the so-called afterglow phenomenon, in which non-null field values are measured after the arrival of the first wavefront. This paper analyzes the consequences of this phenomenon, both theoretically and numerically, and presents a simple method to compensate for it. Verification of this method is presented using various discrete-time numerical schemes.
By T. Mateos, J. Escolano, C. Spa, A. Garriga, BM
July 5 , 2010   Title: "Evaluating Detection of Near Duplicate Video Segments"
Paper to download   (...) In this paper we have implemented several evaluation measures found in the literature and we apply them to real algorithm outputs and a simulated result data set. We then calculate the correlation between the results obtained with the different measures in order to investigate whether they can be compared or not. The results show that the correlation between the measures is in some cases quite low, and some measures are especially sensitive to certain types of deviations from the ground truth.
By Werner Bailer, JRS
July 25 , 2010   Title: "2020 3D Media: New directions in Immersive Entertainment"
Paper to download   (Poster abstract) This research project is conducted by a consortium of European industrial and academic partners that includes companies such as Technicolor, Digital Projection, DTS Europe, Doremi, Mediapro, Creative Workers (CREW) and Datasat, and research centers: Barcelona Media, Joanneum Research, University of Hasselt, University of Reading and Fraunhofer. It aims to research, develop and demonstrate novel forms of compelling entertainment experiences based on new technologies for the capture, production, networked distribution and display of three-dimensional sound and images.
By S. Fort, BM
August , 2010   Title: "Spatial string matching for image classification"
Paper to download   This paper presents a spatial string matching method to incorporate spatial information into the bag-of-words model, which represents an image as an unordered distribution of local features. Spatial constraints among neighboring features are explored in order to achieve better discrimination power for image classification. The features from neighboring points are combined together and taken as a spatial string, and then our method matches the images according to the similarity of string pairs. The categorization problem can be formulated using a KNN or SVM classifier based on the spatial string matching kernel. The proposed method is able to capture spatial dependencies across the neighboring features. Experimental results show promising performance for image classification tasks.
By Yunqiang Liu and Vicent Caselles, BM
August 23 , 2010   Title: "Stereoscopic image inpainting: distinct depth maps and images inpainting"
Paper to download   In this paper we propose an algorithm for inpainting of stereo images. The issue is to reconstruct the holes in a pair of stereo images as if they were the projection of a 3D scene. Hence, the reconstruction of the missing information has to produce a consistent visual perception of depth. Thus, the first step of the algorithm consists in the computation and inpainting of disparity maps in the given holes. The second step of the algorithm is to fill in missing regions using the complete disparity maps in a way that avoids the creation of 3D artifacts. We present some experiments on several pairs of stereo images.
By Alexandre Hervieu, Nicolas Papadakis, Aurélie Bugeau, Pau Gargallo, Vicent Caselles, BM
September 9 , 2010   Title: "A New Infrastructure for High Resolution 3D and Slow Motion Production-Sets Featuring the 'FlashPak II' Field-Recorder Demonstrator"
Paper to download   The current success of immersive 3D experiences in feature films and the resulting move to push such technology into TV call for the extension of today's standard digital HD or 2K/4K capture process towards stereo imaging or other means of depth representation. Currently, stereoscopic high resolution capture rigs employ proprietary workflows and a non-unified infrastructure based on legacy technology taken from the 2D domain. As this sudden demand for at least doubled recording bandwidth and storage space is clearly stretching the practicability limits of this legacy infrastructure, the already overdue paradigm shift towards a modern, future-proof capture and recording infrastructure is getting another push. This paper describes a comprehensive infrastructure approach based on new interface and field recording technologies for capture devices that targets the challenging requirements of today's and future 3D production setups. The new interface approach harmonizes and consolidates image data, depth information and general metadata to ease 3D production workflows. Furthermore, a handy field recorder solution based on solid state technology is introduced that is able to record up to seven uncompressed camera streams in parallel.
By T. Brune and J.P. Wittenburg, Technicolor
September 20 , 2010   Title: "CUDA Acceleration of Color Histogram Matching"
Paper to download   A common approach to histogram matching uses cumulative distribution functions (CDFs). First, we calculate the normalized histograms of the two images (hA and hB) and their respective cumulative distribution functions (CDFA and CDFB). Then, a matching between the CDFs is performed. Given the reference CDF, CDFA, for each gray level GI we find the corresponding gray level GJ for which CDFA(GI) = CDFB(GJ); if I and J correspond to different values, a gray level in the image IB is replaced in order to match the CDFs. Our approach considers the ideas of [Rolland et al. 2000] with an Nvidia 3D broadcast solution system using professional HD cameras.
By Antonio S. Montemayor, Raúl Cabido, Juan José Pantrigo, URJC, and Xavi Rodríguez, Sergi Sagàs, Mediapro
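The CDF-based matching described in the abstract above can be sketched on the CPU in a few lines of Python. This is only an illustration of the general technique, not the CUDA implementation from the paper; the function names `normalized_cdf` and `match_histogram`, the flat-list image representation and the rule of picking the smallest reference level whose CDF reaches the target value are assumptions made for this example.

```python
from bisect import bisect_left


def normalized_cdf(pixels, levels=256):
    """Cumulative distribution of gray levels for a flat list of pixel values."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = float(len(pixels))
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf


def match_histogram(image_b, image_a, levels=256):
    """Remap the gray levels of image_b so that its CDF follows image_a's.

    image_a plays the role of the reference (CDFA), image_b of the image
    whose gray levels are replaced (IB).
    """
    cdf_a = normalized_cdf(image_a, levels)
    cdf_b = normalized_cdf(image_b, levels)
    # For each level of B, take the smallest level of A whose CDF value
    # is at least as large (exact equality is rare on discrete data).
    lookup = [min(bisect_left(cdf_a, cdf_b[g]), levels - 1)
              for g in range(levels)]
    return [lookup[p] for p in image_b]
```

Because the lookup table has one entry per gray level and each pixel is remapped independently, the per-pixel step is the kind of work that parallelizes naturally on a GPU.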
September 26 , 2010   Title: "Polyconvexification of the multi-label optical flow problem"
Paper to download   In this paper the problem of optical flow and occlusion mask estimation is addressed. To that end, we consider a multi-label representation of the optical flow and we define an energy that models the problem. The convexification of the energy and its minimization with an iterative algorithm are studied. Our algorithm is implemented on the GPU, since each pixel can be processed in parallel. From our experiments, the relation between the quality of the results obtained and computing time seems to be very promising.
By N. Papadakis, A. Baeza, P. Gargallo, BM and V. Caselles, UPF
October, 2010   Title: "A Comprehensive Framework for Image Inpainting"
Paper to download   Inpainting is the art of modifying an image in a form that is not detectable by an ordinary observer. There are numerous and very different approaches to tackle the inpainting problem, though as explained in this paper, the most successful algorithms are based upon one or two of the following three basic techniques: copy-and-paste texture synthesis, geometric partial differential equations (PDEs), and coherence among neighboring pixels. We combine these three building blocks in a variational model, and provide a working algorithm for image inpainting trying to approximate the minimum of the proposed energy functional. Our experiments show that the combination of all three terms of the proposed energy works better than taking each term separately, and the results obtained are within the state-of-the-art.
By A. Bugeau, M. Bertalmio, V. Caselles, G. Sapiro, BM, UPF
November 17 , 2010   Title: "Tracking and Clustering Salient Features in Image Sequences"
Paper to download   Many applications in media production need information about moving objects in the scene, e.g. insertion of computer-generated objects, association of sound sources to these objects or visualization of object trajectories in broadcasting. We present a GPU accelerated approach for detecting and tracking salient features in image sequences and we propose an algorithm for clustering the obtained feature point trajectories in order to obtain a motion segmentation of the set of feature trajectories. Evaluation results for both the tracking and clustering steps are presented. Finally we discuss the application of the proposed approach for associating audio sources to objects to support audio rendering for virtual sets.
By Jakub Rosner, Silesian University and Werner Bailer, Hannes Fassold, Felix Lee, JRS
November , 2010   Title: "Virtual Camera Synthesis for Soccer Game Replays"
Paper to download   In this paper, we present a set of tools developed during the creation of a platform that allows the automatic generation of virtual views in a live soccer game production. Observing the scene through a multi-camera system, a 3D approximation of the players is computed and used for the synthesis of virtual views. The system is suitable both for static scenes, to create bullet time effects, and for video applications, where the virtual camera moves as the game plays.
By N. Papadakis, A. Baeza, X. Armangué, I. Rius, A. Bugeau, O. D'Hondt, P. Gargallo, V. Caselles and S. Sagas, BM, Mediapro
November , 2010   Title: "Unsupervised Motion Layer Segmentation by Random Sampling and Energy Minimization"
Paper to download   In this paper we introduce an unsupervised scheme for the segmentation of motion layers in video sequences. The number of layers is automatically determined by the method. Our approach first extracts the motion models thanks to a RANSAC-based random sampling algorithm improved by the use of geodesic distance information. Then those models are assigned to pixels in the color image by minimizing an energy functional thanks to graph-cut. Our energy takes into account motion residuals, color distributions, geodesic distance as well as temporal consistency of the layers. Moreover, we define a smoothness term that enforces a patch-wise spatial coherency on areas where optical flow is reliable and a pixel-wise coherency on occluded areas. The method leads to promising results on the tested sequences.
By Olivier D'Hondt and Vicent Caselles, BM
November 17 , 2010   Title: "A Comprehensive Infrastructure and Workflow for Acquisition of High-resolution Multi-view Content"
Paper to download   The current success of immersive 3D experiences in feature films and the trend towards 3D TV require advanced tools and workflows for high-quality capture of multi-view live scenes including depth information. These requirements are not fulfilled by today's capture workflows and infrastructures based on legacy technology from 2D capture solutions. We propose a comprehensive infrastructure for capture of multiview content supporting different methods to obtain depth information. The paper describes a device for field capture and tools for online and offline content analysis and browsing of the captured content that have been integrated into the proposed infrastructure.
By Werner Bailer, Christian Schober, JRS and Thomas Brune, Technicolor
December 28, 2010   Title: "A Variational Framework for Exemplar-Based Image Inpainting"
Paper to download   In this paper we propose an algorithm for inpainting of stereo images. The issue is to reconstruct the holes in a pair of stereo images as if they were the projection of a 3D scene. Hence, the reconstruction of the missing information has to produce a consistent visual perception of depth. Thus, the first step of the algorithm consists in the computation and inpainting of disparity maps in the given holes. The second step of the algorithm is to fill in missing regions using the complete disparity maps in a way that avoids the creation of 3D artifacts. We present some experiments on several pairs of stereo images.
By Pablo Arias, Gabriele Facciolo, Guillermo Sapiro, Vicent Caselles, BM
January 4, 2011   Title: "Stereoscopic Image Inpainting Using Scene Geometry"
Paper to download   In this paper we propose an algorithm for inpainting of stereo images. The issue is to reconstruct the holes in a pair of stereo images as if they were the projection of a 3D scene. Hence, the reconstruction of the missing information has to produce a consistent visual perception of depth. Thus, the first step of the algorithm consists in the computation and inpainting of disparity maps in the given holes. The second step of the algorithm is to fill in missing regions using the complete disparity maps in a way that avoids the creation of 3D artifacts. We present some experiments on several pairs of stereo images.
By Alexandre Hervieu, Nicolas Papadakis, Aurélie Bugeau, Pau Gargallo, Vicent Caselles, BM
January 15 , 2011   Title: "Video Browsing Using Object Trajectories"
Paper to download   Video browsing methods are complementary to search and retrieval approaches, as they allow for exploration of unknown content sets. Objects and their motion convey important semantics of video content, which is relevant information for video browsing. We propose extending an existing video browsing tool in order to support clustering of objects with similar motion and visualization of the objects' positions and trajectories. This requires the automatic extraction of moving objects and estimation of their trajectories, as well as the ability to group objects with similar trajectories. For the first issue we describe the application of a recently proposed motion trajectory clustering algorithm; for the second we use k-medoids clustering and the dynamic time warping distance. We present evaluation results of both steps on real world traffic sequences from the Hopkins155 data set. Finally, we describe the representation of analysis results using MPEG-7 and the integration into the video browsing tool.
By Felix Lee and Werner Bailer, JRS
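The grouping step mentioned in the abstract above, k-medoids clustering with the dynamic time warping (DTW) distance, can be sketched as follows. This is a generic textbook version written for illustration, not the authors' implementation; the function names, the representation of a trajectory as a list of (x, y) points and the naive choice of the first k items as initial medoids are all assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two trajectories of (x, y) points."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dx = a[i - 1][0] - b[j - 1][0]
            dy = a[i - 1][1] - b[j - 1][1]
            point_dist = (dx * dx + dy * dy) ** 0.5
            # Extend the cheapest of the three admissible warping steps
            cost[i][j] = point_dist + min(cost[i - 1][j],
                                          cost[i][j - 1],
                                          cost[i - 1][j - 1])
    return cost[n][m]


def k_medoids(items, k, dist, max_iter=20):
    """Cluster items around k medoids; returns {medoid index: [member indices]}."""
    n = len(items)
    d = [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]

    def assign(medoids):
        clusters = {m: [] for m in medoids}
        for i in range(n):
            clusters[min(medoids, key=lambda m: d[i][m])].append(i)
        return clusters

    medoids = list(range(k))  # assumed start: the first k items
    for _ in range(max_iter):
        clusters = assign(medoids)
        # New medoid: the member with the smallest total distance to the others
        updated = [
            min(members, key=lambda c: sum(d[c][o] for o in members))
            if members else m
            for m, members in clusters.items()
        ]
        if updated == medoids:
            break
        medoids = updated
    return assign(medoids)
```

DTW lets trajectories of different lengths be compared, and k-medoids only needs pairwise distances rather than an averaging operation, so the two fit together for grouping object trajectories.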


[End of academic publications]

EU funded projects

 NAME  DESCRIPTION
3D Interactive and Immersive Media Cluster   3D Media Cluster is the main umbrella structure embracing related EC funded projects to develop joint strategic goals towards 3D Media in the context of future Internet.
www.3dmedia-cluster.eu
Mobile 3DTV   MOBILE3DTV - Mobile 3DTV Content Delivery Optimization over DVB-H System - is a three-year project partly funded by the European Union 7th RTD Framework Programme in the context of the Information & Communication Technology (ICT) Cooperation Theme and its objective 1.5 Networked Media. The project started on 1st January 2008. The main objective of MOBILE3DTV is to demonstrate the viability of the new technology of mobile 3DTV. The project develops a technology demonstration system for the creation and coding of 3D video content, its delivery over DVB-H and display on a mobile device.
http://sp.cs.tut.fi/mobile3dtv/
3D4YOU   The 3D4YOU concept is to establish practical 3D television. The key success factor is 3D content. The project will define a 3D delivery format and a content creation process. Establishing practical 3D television will then be demonstrated by embedding this content creation process into a 3DTV production and delivery chain, including capture, image processing, delivery and then display in the home.
http://www.3d4you.eu
3D Presence   The 3D Presence project proposes a research and development agenda that is both timely and necessary. It is born from the realization that effective communication and collaboration with geographically dispersed co-workers, partners, and customers requires a natural, comfortable, and easy-to-use experience that utilizes the full bandwidth of non-verbal communication.
http://www.3dpresence.eu/
FOCUS K3D   FOCUS K3D (ICT-2007-214993) is a Coordination Action (CA) of the European Union's 7th Framework Programme which aims at promoting the adoption of best practices for the use of semantics in 3D content modeling and processing.
The project started in March 2008 and will finish in February 2010 (24 months).
http://www.focusk3d.eu/
i3DPost   i3Dpost will improve quality and reduce the cost of high-level media production by applying intelligent technologies to the extraction of structured 3D content models from video. This will enable the increasingly automatic manipulation and re-use of characters, with changes of viewpoint and lighting. The research will advance the state of the art in 3D video production, 3D motion estimation, post-production tools and media semantics. The result will be film quality 3D content in a structured form, with semantic tagging, which can be manipulated in a graphic production pipeline and used across different media platforms.
http://www.i3dpost.eu/
3DPHONE   The project aims to develop technologies and core applications enabling a new level of user experience by developing an end-to-end all-3D imaging mobile phone.
http://the3dphone.eu
MUTED   The MUTED project aims to produce the first 3D TV display capable of supporting multiple mobile viewers simultaneously, without the need for 3D glasses.
http://www.muted3d.eu


Professional associations

 NAME  DESCRIPTION
Stereoscopic Society   The Society has been in existence since 1893 and its objectives are to foster the practice, enjoyment and advancement of all forms of stereoscopy - pictures in three dimensions.

www.stereoscopicsociety.org.uk

International Stereoscopic Union   The International Stereoscopic Union (ISU) was founded in 1975 and is the only international 3D association in the world. The ISU is a club of individual 3D enthusiasts as well as a club of stereo clubs. The ISU's members currently number more than 1,000 and come from over 40 countries world-wide.

http://www.stereoscopy.com/isu/index.html


Magazines

 NAME  DESCRIPTION
Stereoscopy news   This web magazine is fully dedicated to stereoscopy. There is also a compact newsletter, launched in September 2009, that is distributed weekly by email.

www.stereoscopynews.com

3D review.com   Information about new stereoscopic products from around the world

http://www.rollanet.org/~vbeydler/van/3dreview/
