| DATE |
|
DESCRIPTION |
| May 30, 2008 |
|
Title:
"Project Hand Book and Quality Plan" |
 |
|
This document is the Project Handbook for the 2020 3D Media Project.
By Eugenia Fuenmayor and Elena Torres, Barcelona Media. |
|
| May 30, 2008 |
|
Title:
"2020 3D Media Initial IP Strategy Document" |
 |
|
This document describes how Intellectual Property is handled within the 2020 3D Media
consortium.
By Eugenia Fuenmayor and Elena Torres, Barcelona Media. |
|
| May 30, 2008 |
|
Title:
"Dissemination Strategy" |
 |
|
Initial version of the plan for disseminating the activities and the
generated knowledge and products of the project.
By Santi Fort, Barcelona Media. |
|
| August 22, 2008 |
|
Title:
"Self-assessment Plan" |
 |
|
Quality Management is a Consortium Management task designed to ensure that the project stays on track and meets its goals.
By Tom Evans and Peter Stansfield, coordinated by Barcelona Media.
|
|
| August 28, 2008 |
|
Title:
"Definition of a new camera architecture" |
 |
|
This document describes the changes needed in the architecture of a regular
professional 2D camera to turn it into a camera that can capture scenes in 2D + depth.
By Klaas Jan Damstra, Grass Valley Netherland |
|
| August 31, 2008 |
|
Title:
"Requirements and definition of new architecture for combining Viper camera with Trifocal Stereo and Structured Light Devices" |
 |
|
3D Image Capture combining tri-focal stereovision, spatio-temporal structured light and
a cinematographic camera with Z-channel into a unified video-and-depth recording device.
By Ralf Tanger, Fraunhofer HHI and others |
|
| October 21, 2008 |
|
Title:
"Report on magnification errors in the optical path and possible algorithms to create higher resolution out of 2K channels" |
 |
|
Market trends and the state of the art
in technology with respect to resolution, frame rates and bit
depth for the high-end cameras to be designed.
By Klaas Jan Damstra, Grass Valley Netherland |
|
| February 25, 2009 |
|
Title:
"Support for multi-view content in metadata
standards and formats" |
 |
|
This document collects the metadata elements required for
the production of multi-source digital cinema content.
By Werner Bailer, JRS and others. |
|
| February 13, 2009 |
|
Title: "A new in-field recording architecture" |
 |
|
Specification of the in-field recorder for the needs of this project and, more generally, for a proposed future 3D acquisition approach.
By Thomas Brune, DTO |
|
| February 11, 2009 |
|
Title:
"Algorithm to create a higher resolution
demonstrated on PC" |
 |
|
Results of the implementation of the demonstrator of the offset processing
By Klaas Jan Damstra, Grass Valley Netherland |
|
| February 28, 2009 |
|
Title:
"Improved algorithms for 2D-3D image conversion" |
 |
|
2D-3D conversion consists of turning any 2D
content into 3D.
By Philippe Gérard, Doremi. |
|
| February 25, 2009 |
|
Title:
"Image inpainting methods and multi-view image
processing" |
 |
|
Proposal of an image inpainting method adapted to artefact correction in the context of view
synthesis.
By Vicent Caselles, Barcelona Media. |
|
| February 26, 2009 |
|
Title:
"Specification of design and implementation of
automated distribution system." |
 |
|
This document identifies the main requirements and
specifies the software functionality for automated
identification and distribution of content.
By Dave Stringer, Datasat. |
|
| February 28, 2009 |
|
Title:
"Requirements and Standards for Spatialised Audio
play-out" |
 |
|
This document presents the proposal for
a spatial audio format that is suitable for distribution and
independent of the exhibition system.
By Toni Mateos, Barcelona Media. |
|
| June 6, 2009 |
|
Title:
"Market Analysis, Business Models and Exploitation Planning" |
 |
|
This document gives an overview of the markets and technology involved in 3D stereoscopic content creation and distribution, and of the business models that support the exhibition of such content.
By Jordi Alonso, Mediapro. |
|
| May 29, 2009 |
|
Title:
"New camera architecture design" |
 |
|
This report describes what has been achieved so far in the design and (partial) use of the new camera
architecture in the 2020 3D Media project.
|
|
| October 15, 2009 |
|
Title:
"Workflow Design and Experimental Testing" |
 |
|
Considerations on workflow design, drawn from the experience of producing 3D content with multiple different cameras and tools.
|
|
| July 31, 2009 |
|
Title:
"Content analysis tools for multi-source content" |
 |
|
This document describes the prototype of the multi-view content analysis framework.
|
|
| August 25, 2009 |
|
Title:
"Prototype recording solution prepared for enhanced resolution application" |
 |
|
This paper describes a first version prototype of a new portable field recorder solution for 3D image capture.
|
|
| August 26, 2009 |
|
Title:
"Initial Prototype 2K Viper Camera with Satellite Cameras" |
 |
|
Hardware and software components of a trifocal depth capture prototype are described.
|
|
| September 02, 2009 |
|
Title:
"Initial prototype 2K camera with structured light beamer" |
 |
|
This document describes our initial prototype of a 2K resolution, real-time capable, depth capture method for cinematographic or broadcast use.
|
|
| September 08, 2009 |
|
Title:
"Immersive/augmented/3D display system v1" |
 |
|
Description of the architecture of the initial prototype for displaying immersive and 3D images.
|
|
| November 27, 2009 |
|
Title:
"Prototype (2K) Viper camera with combined satellite/beamer" |
 |
|
Components of a depth capture prototype combining a structured light and a trifocal approach are described.
|
|
| February 28, 2010 |
|
Title:
"Semantically Driven Workflow System" |
 |
|
This report presents the Semantically Driven Workflow System prototype as derived from the requirements collected, exposed and analysed.
|
|
| February 26, 2010 |
|
Title:
"Extensions of MPEG-C Part3, MVC and MXF for advanced depth and 3D audio representation" |
 |
|
This document summarizes the metadata specification throughout the workflows covered by 2020 3D Media.
|
|
| February 26, 2010 |
|
Title:
"Higher resolution algorithm demonstrated on new camera architecture" |
 |
|
A short description and an MPEG video demo of the enhanced-resolution demo package.
|
|
| February 26, 2010 |
|
Title:
"Plug-ins to import depth-enhanced image and auxiliary data" |
 |
|
This deliverable describes an OpenFX plug-in for post-production tools such as Nuke, Mistika and others.
|
|
| February 18, 2010 |
|
Title:
"Object segmentation and re-detection algorithm for multi-view content v1" |
 |
|
This deliverable describes the prototypes that carry out the pre-processing tasks for object re-detection.
|
|
| February 26, 2010 |
|
Title:
"Deconvolution method for audio in real-time" |
 |
|
The research described here focuses on two blind algorithms for sound deconvolution.
|
|
| February 26, 2010 |
|
Title:
"Automated distribution systems feeding Cinema and/or public spaces" |
 |
|
The deliverable presents the initial prototype of a digital distribution system that operates on metadata and orders to transfer the required files to their destinations in an efficient and timely manner.
|
|
| March 08, 2010 |
|
Title:
"Evaluation procedures" |
 |
|
This deliverable presents the evaluation methodologies to be followed in order to gather feedback from professional and end users.
|
|
|
| DATE |
|
DESCRIPTION |
| October 16, 2008 |
|
Title:
"Being inside the image. Heightening the sense of presence in a video captured
environment through artistic means: the case of CREW." |
 |
|
This paper explores the use of omni-directional video, a high impact immersive medium, by a performance group and multi-disciplinary team of artists and scientists.
By Nele Wynants, Kurt Vanhoutte, Theatre, Film and Literature Studies, University of Antwerpen and Philippe Bekaert, Expertise Centre for Digital Media, University of Hasselt |
|
| September 12, 2008 |
|
Title:
"Cognitive immersion" |
 |
|
The goal of this paper is to present our research about the
cognitive processes involved in 3D immersion and present
our research plans about how this type of environment could
help people with special needs, in particular people
with cerebral palsy.
By Elena Parra and Raquel Navarro, BM |
|
| June 22, 2008 |
|
Title:
"Video Enhancement Using Reference Photographs" |
 |
|
Handheld digital video cameras have become increasingly
popular and cheaper in recent years. Even still cameras
offer additional functionality for shooting videos. Unfortunately,
the small form factor of these devices limits the light sensitivity,
and often the lens and sensor do not allow for satisfactory image
quality. We wish to process a video into a
more aesthetically pleasing version, by borrowing information from
high quality reference photographs of the same scene. Since the
process of taking a photograph is not time-critical, we can afford
a longer exposure for reducing noise, and record more information
to increase resolution.
By Cosmin Ancuti, Tom Haber, Tom Mertens, Philippe Bekaert, Hasselt University |
|
| June 22, 2008 |
|
Title:
"Eunomia: Toward a Framework for Multi-touch
Information Displays in Public Spaces" |
 |
|
Multi-touch interaction techniques are becoming more
widespread because of new industrial initiatives to make
this hardware available and affordable for the consumer
market. To cope with the diversity in hardware setups and
the lack of knowledge about developing generic multi-touch
applications, we created a framework, Eunomia, that
abstracts the hardware from the software and enables
software developers to easily develop interactive
applications taking advantage of multi-touch interaction.
By Tom Cuypers, Jan Schneider, Johannes Taelman, Kris Luyten, Philippe Bekaert, Hasselt University |
|
| June 26, 2008 |
|
Title:
"IT leverage for media acquisition: New paradigms in the key area of digital cinematography and HD production workflows" |
 |
|
This paper proposes the demonstration of a comprehensive novel acquisition infrastructure based on standard 10G Ethernet interface technology. This infrastructure is suitable for digital cinematography as well as HD production workflows and comprises camera, mobile solid-state recorder and monitor, all connected through a 10G Ethernet-based network. Optical Ethernet technology enables bidirectional, high-speed, long-distance networking at affordable cost, as conventional IT equipment such as switches is fully supported. Interface and file formats enable versatile metadata support and provide a back channel to the camera for control purposes. Advances in solid-state storage capacity in combination with interface performance eliminate any need for costly and quality-compromising compression.
By T. Brune, A. Kochale, J.P. Wittenburg, Technicolor |
|
| January 1, 2009 |
|
Title:
"Time-triggered Static Schedulable Dataflows for Multimedia Systems" |
 |
|
Software-based reactive multimedia computation systems are pervasive today in desktops but also in mobile and ultra-portable devices. Most such systems offer a callback-based architecture to incorporate specific stream processing. Article presented at the Electronic Imaging - Multimedia Networking and Computing conference in January 2009 in San Jose, California.
By Pau Arumí, BM and Xavier Amatriain, Telefonica Research |
|
| May 1, 2009 |
|
Title:
"3D-Audio with CLAM and Blender's Game Engine" |
 |
|
Blender can be used as a 3D scene modeler, editor,
animator, renderer and Game Engine. This paper
describes how it can be linked to a 3D sound platform
working within the CLAM framework, with
special emphasis on a specific application: the recently launched Yo Frankie! open content game for
the Blender Game Engine. The game was hacked
to interact with CLAM, implementing spatial scene
descriptors transmitted over the Open Sound Control
protocol and allowing experimentation with many different spatialization and acoustic simulation algorithms. Further new applications of this Blender-CLAM
integration are also discussed.
By Natanael Olaiz, Pau Arumí, Toni Mateos and David Garcia, BM |
|
| March 2009 |
|
Title:
"A Framework for Networked Interactive Surfaces" |
 |
|
The development of interactive surfaces has led to a number of applications, as it allows for natural interaction and collaboration. Co-located collaboration on this kind of surface provides new possibilities. It is our belief, however, that an even higher degree of collaboration can be achieved by overcoming the boundaries of a single device or setup. Therefore, we extended our previously built multi-touch framework to realize collaborative multi-device setups. To assess the framework, a networked virtual world and three other applications were built to demonstrate its power.
By Tom Cuypers, Karel Frederix, Chris Raymaekers, Philippe Bekaert, Univ. of Hasselt. |
|
| March 23, 2009 |
|
Title:
"Estimating 3D camera motion for rendering
audio in virtual scenes" |
 |
|
In the production phase of digital cinema content it is
important to allow the director not only to preview the final rendered scene early in the shooting process, but also to prehear the 5.1 surround and HRTF-based binaural versions, thus enabling visual and auditive artistic decisions to be taken at the shooting stage. In many cases camera tracking data is
not available for all cameras on the set (e.g. handheld ones) and thus the motion of the camera needs to be estimated. In this paper we describe an approach for estimation of 3D camera motion and its use for real-time audio rendering.
By Werner Bailer, JRS, and Pau Arumí, Toni Mateos, Adan Garriga, Jaume Durany, David García, BM. |
|
| May 2009 |
|
Title:
"Summarizing Raw Video Material Using Hidden Markov Models" |
 |
|
Besides the reduction of redundancy, the selection of representative segments is a core problem when summarizing collections of raw video material. We propose a novel approach for the selection of segments to be included in a video summary based on Hidden Markov Models (HMM), which are trained on an annotated subset of the content. The observations of the HMM are relevance judgments of content segments based on different visual features; the hidden states are the selection/non-selection of content segments. The HMM is designed to take all relevant scenes into account. We show that the approach generalizes well when trained on sufficiently diverse content.
By Werner Bailer and Georg Thallinger, JRS |
|
| June 20, 2009 |
|
Title:
"Depth from sliding projections" |
 |
|
In this paper we present a novel method for 3D structure acquisition, based on structured light. Unlike classical structured light methods, in which a static projector illuminates a scene with time-varying illumination patterns, our technique makes use of a moving projector emitting a static striped illumination pattern. This projector is translated at a constant velocity, in the direction of the projector's horizontal axis. Illuminating the object in this manner allows us to perform a per pixel analysis, in which we decompose the recorded illumination sequence into a corresponding set of frequency components. The dominant frequency in this set can be directly converted into a corresponding depth value. This per pixel analysis allows us to preserve sharp edges in the depth image. Unlike classical structured light methods, the quality of our results is not limited by projector or camera resolution, but is solely dependent on the temporal sampling density of the captured image sequence. Additional benefits include a significant robustness against common problems encountered with structured light methods, such as occlusions, specular reflections, subsurface scattering, interreflections, and to a certain extent projector defocus.
By Tom Cuypers, Karel Frederix, Chris Raymaekers, Philippe Bekaert, Univ. of Hasselt. |
|
| June 20, 2009 |
|
Title:
"Capturing Multiple Illuminations using Time and Color Multiplexing" |
 |
|
Many vision and graphics problems such as relighting, structured light scanning and photometric stereo, need images of a scene under a number of different illumination conditions. It is typically assumed that the scene is static. To extend such methods to dynamic scenes, dense optical flow can be used to register adjacent frames. This registration becomes inaccurate if the frame rate is too low with respect to the degree of movement in the scenes. We present a general method that extends time multiplexing with color multiplexing in order to better handle dynamic scenes. Our method allows for packing more illumination information into a single frame, thereby reducing the number of required frames over which optical flow must be computed. Moreover, color-multiplexed frames lend themselves better to reliably computing optical flow. We show that our method produces better results compared to time multiplexing alone. We demonstrate its application to relighting, structured light scanning and photometric stereo in dynamic scenes.
By Bert De Decker, Jan Kautz, Tom Mertens and Philippe Bekaert, Univ. of Hasselt. |
|
| July 2009 |
|
Title:
"Methodology for studying the numerical speed of sound in finite differences" |
 |
|
The space and time discretization inherent to all FDTD schemes introduces non-physical dispersion errors, i.e. deviations of the speed of sound from the theoretical value predicted by the governing Euler differential equations. A general methodology for computing this dispersion error via straightforward numerical simulations of the FDTD schemes is presented. The method is shown to provide remarkable accuracies of the order of 1/1000 in a wide variety of two-dimensional finite difference schemes.
By C. Spa, UPF, T. Mateos, A. Garriga, BM |
|
| September 2009 |
|
Title:
"Shadow Multiplexing for Real-Time Silhouette Extraction" |
 |
|
By Tom Cuypers, Yannick Francken, Johannes Taelman, Philippe Bekaert, Univ. of Hasselt |
|
| September 4, 2009 |
|
Title:
"Realtime KLT feature point tracking for High Definition video" |
 |
|
Automatic detection and tracking of feature points is an important part of many computer vision methods. A
widely used method is the KLT tracker proposed by Kanade, Lucas and Tomasi. This paper reports work done
on porting the KLT tracker to the GPU using NVIDIA's CUDA technology. The paper describing the CUDA port of the point tracker has been accepted as a full paper at the Computer Graphics, Computer Vision and Mathematics (GraVisMa) workshop.
By Hannes Fassold, Jakub Rosner, Peter Schallauer and Werner Bailer, JRS |
|
| December, 2009 |
|
Title:
"Relighting Objects from Image Collections" |
 |
|
We present an approach for recovering the reflectance of a static scene with known geometry from a collection of images taken under distant, unknown illumination. In contrast to previous work, we allow the illumination to vary between the images, which greatly increases the applicability of the approach. Using an all-frequency relighting framework based on wavelets, we are able to simultaneously estimate the per-image incident illumination and the per-surface-point reflectance. The wavelet framework allows for incorporating various reflection models. We demonstrate the quality of our results for synthetic test cases as well as for several datasets captured under laboratory conditions. Combined with multi-view stereo reconstruction, we are even able to recover the geometry and reflectance of a scene solely using images collected from the Internet.
By Tom Haber, Christian Fuchs, Philippe Bekaert, H.P. Seidel. Univ. of Hasselt |
|
| December, 2009 |
|
Title:
"Gloss and Normal Map Acquisition of Mesostructures Using Gray Codes" |
 |
|
We propose a technique for gloss and normal map acquisition of fine-scale specular surface details, or mesostructure. Our main goal is to provide an efficient, easily applicable, but sufficiently accurate method to acquire mesostructures. We therefore introduce a setup consisting of cheap and accessible components, including a regular monitor and a digital still camera. We build on our previously proposed method [Francken et al. 2008], which acquires normal maps by analyzing the reflection of binary patterns (Gray codes). These patterns are successively refined in this process. The key idea in this paper is that this refinement also allows us to measure the shininess for each spatial location, resulting in a gloss map. Acquiring spatially varying reflectance usually requires a complicated hardware setup, which measures the BRDF at each spatial location. Our method is much simpler and cheaper. Even though we assume a very simplified BRDF model, our technique is able to reproduce the mesostructure's appearance faithfully.
By Yannick Francken, Tom Cuypers, Tom Mertens, Philippe Bekaert. Univ. of Hasselt |
|
| December 9, 2009 |
|
Title:
"Metadata for Creation and Distribution of Multi-view Video" |
 |
|
This paper reviews
the metadata requirements for multi-view video content and analyzes
how well these requirements are covered in existing metadata standards,
both in terms of the coverage of metadata elements and the capabilities
to structurally describe multi-view video content. Presented at the Workshop on Semantic Multimedia Database Technologies (SeMuDaTe).
By Bailer W., Höffernig M. JRS |
|
| December 9, 2009 |
|
Title:
"Formal Metadata Semantics for Interoperability in the Audiovisual Media Production Process" |
 |
|
Different metadata formats and standards are used in the stages of
the process, containing metadata properties with similar and partly overlapping
semantics, though not fully identical. We attempt to model the
metadata properties used throughout the production process in a format
independent way by creating an ontology that models these properties
and the relations between them.
By Höffernig M., Bailer W, JRS |
|
| December 12, 2009 |
|
Title:
"How to Align Media Metadata Schemas? Design
and Implementation of the Media Ontology" |
 |
|
Multimedia data is generated, shared, stored and distributed
worldwide at an ever increasing rate. This huge amount of content comes
with metadata represented in different formats which hardly interoperate
although they partially overlap.
By Florian Stegmaier, Werner Bailer, Tobias Bürger, et al., JRS |
|
| November 2, 2009 |
|
Title:
"Fast GPU-based KLT feature point tracking
using CUDA" |
 |
|
Automatic selection and tracking of feature points is a basic task
for many computer vision algorithms (e.g. structure from motion, object tracking). The KLT algorithm
[Kanade, Lucas and Tomasi] is very popular due to its good performance.
By Florian Stegmaier, Werner Bailer, Tobias Bürger, et al., JRS |
|
| November 2, 2009 |
|
Title:
"Comparing Fact Finding Tasks and User Survey for
Evaluating a Video Browsing Tool" |
 |
|
In the multimedia
information retrieval community evaluations following the
Cranfield paradigm (as e.g. used in TRECVID) have been
widely adopted. We have applied two TRECVID style fact
finding approaches (retrieval and question answering tasks)
and a user survey to the evaluation of a video browsing tool.
By Werner Bailer and Herwig Rehatschek, JRS |
|
| December 17, 2009 |
|
Title:
"A Video Browsing Tool for Content Management in Post-production" |
 |
|
We propose an interactive video browsing tool for supporting content management and selection in post-production. The approach is based on a process model for multimedia content abstraction. A software framework based on this process model and desktop and Web-based client applications are presented. International Journal of Digital Multimedia Broadcasting, 2010
By Werner Bailer, Wolfgang Weiss, Gert Kienast, Georg Thallinger, and Werner Haas, JRS |
|
| July 14, 2009 |
|
Title:
"Integrative Workflow-Embedded Support For Process Automation And Management" |
 |
|
This paper presents the on-going research performed in order to integrate process automation and process management support in the context of media production. This has been addressed on the basis of a holistic approach to software engineering applied to media production modeling to ensure design correctness, completeness and effectiveness. European and Mediterranean Conference on Information Systems 2009
By Badii A., Fuschi D., Zhu M., Univ. of Reading |
|
| November 23, 2009 |
|
Title: "Confronto sperimentale tra microfono B-format Soundfield e sonda intensimetrica di pressione e velocità Microflown" |
 |
|
B-Format microphones work by processing the signals obtained from a coincident tetrahedral arrangement of directional microphone capsules. These signals are linearly combined to output four channels, denoted W, X, Y and Z, which are proportional respectively to the sound pressure and to the three Cartesian components of the acoustic velocity vector. The signals are equalized to compensate for interference effects caused by the imperfect spatial coincidence of the capsules. The latest-generation Microflown intensity probes consist of a pressure microphone and three velocity transducers operating on the principle of dual hot-wire anemometry, thus allowing a direct and nearly coincident measurement of the acoustic variables p and v. Because of their particular operating principle, the velocity sensors have an amplitude and phase response that can be described by a series of three low-pass filters with different cutoff frequencies, which must be accurately compensated at the signal acquisition stage.
By G. Cengarle, T. Mateos and D. Bonsi, BM |
|
| April 2010 |
|
Title:
"A Narrow Band Method for the Convex Formulation of Discrete Multilabel Problems" |
 |
|
We study a narrow band type algorithm to solve a discrete formulation of the convex
relaxation of energy functionals with total variation regularization and non-convex data terms. We
prove that this algorithm converges to a local minimum of the original nonlinear optimization
problem. We illustrate the algorithm with experiments for disparity computation in stereo and a
multi-label segmentation problem, and we check experimentally that the energy of the local minimum
is very close to the energy of the global minimum obtained without the narrow band type method.
By A. Baeza, V. Caselles, P. Gargallo, N. Papadakis, BM |
|
| April 1, 2010 |
|
Title:
"Migrating from Process Automation to Process
Management Support: a Holistic Approach to
Software Engineering applied to Media
Production" |
 |
|
This paper presents the on-going research performed in order to migrate from process automation to process management support in the context of media production and more specifically 3D cinematographic immersive and interactive production.
By Badii A., Fuschi D., Zhu M., Univ. of Reading |
|
| April, 2010 |
|
Title:
"An improved method to determine the onset timings of reflections in an acoustic impulse response" |
 |
|
Determining the absolute onset time of reflections in an acoustic impulse response (IR) has applications for both subjective and physical acoustics problems. Although computationally simple, a first-order energetic analysis of the IR can lead to false-positive identification of reflections. This letter reports on a new method to determine reflection onset timings using a modified running local kurtosis analysis to identify regions in the IR where the distribution is non-normal. IRs from both real and virtual rooms are used to validate the new method and to find optimum analysis window sizes.
By J. Usher, BM |
|
| May, 2010 |
|
Title:
"A new technology for the assisted mixing of sport events: application to live football broadcasting" |
 |
|
This paper presents a novel application for capturing the sound of the action during a football match by automatically mixing the signals of several microphones placed around the pitch and selecting only those microphones which are close to, or aiming at, the action. The sound engineer is presented with a user interface where he or she can define and move dynamically the point of interest on a screen representing the pitch, while the application controls the faders of the broadcast console. The technology has been applied in the context of a three-dimensional surround sound playback of a Spanish first-division match.
By G. Cengarle, T. Mateos, N. Olaiz, P. Arumi, BM |
|
| May, 2010 |
|
Title:
"Measuring impulse responses using speech and music" |
 |
|
Continuous measurement of room impulse responses (RIRs) in the presence of an audience has many applications for room acoustics: in-situ loudspeaker/room equalization; teleconferencing; and for architectural acoustic diagnostics. A continuous analysis of the RIR is often preferable to a single measurement, especially with non-stationary room characteristics such as from changing atmospheric or audience conditions. This paper discusses the use of adaptive filters updated according to the NLMS algorithm for fast, continuous in-situ RIR acquisition; particularly when the input signal is music or speech. We show that the dual-channel FFT (DCFFT) method has slower convergence and is less robust to coloured signals such as music and speech. Data is presented comparing the NLMS and the DCFFT methods and we show that the adaptive filter approach provides RIRs with high accuracy and high robustness to background noise using music or speech signals
By J. Usher, BM |
|
| June, 2010 |
|
Title:
"Multi-label depth estimation for graph cuts stereo problems" |
 |
|
We describe here a method to compute the depth of a scene from a set of at least two images taken at known viewpoints. Our approach is based on an energy formulation of the 3D reconstruction problem, which we minimize using a graph-cut approach that computes a local minimum whose energy is comparable with the energy of the absolute minimum. As is usually done, we treat the input images symmetrically, match pixels using photo consistency, treat occlusion and visibility problems, and consider a spatial regularization term which preserves discontinuities. The details of the graph construction as well as the proof of the correctness of the method are given. Moreover, we introduce a multi-label refinement algorithm in order to increase the number of depth labels without significantly increasing the computational complexity. Finally, we compare our algorithm with the results available in the Middlebury database.
By N. Papadakis and V. Caselles, BM |
|
| June, 2010 |
|
Title:
"Compensation of the afterglow phenomenon in 2-D discrete-time simulation" |
 |
|
Due to high computational costs, the physics governed by the wave equation in 3D is often modeled via discrete-time numerical simulations conducted in 2D scenarios. Results are normally generalized to 3D rather straightforwardly, overlooking the fact that the propagation of a point-like impulse in 2D exhibits the so-called afterglow phenomenon: non-null field values are measured after the arrival of the first wavefront. This paper analyzes the consequences of this phenomenon, both theoretically and numerically, and presents a simple method to compensate for it. Verification of this method is presented using various discrete-time numerical schemes.
By T. Mateos, J. Escolano, C. Spa, A. Garriga, BM |
|
| July 5, 2010 |
|
Title:
"Evaluating Detection of Near Duplicate Video Segments" |
 |
|
(...) In this paper we have implemented several evaluation measures found in the literature and we apply them to real algorithm outputs and a simulated result data set. We then calculate the correlation between the results obtained with the different measures in order to investigate whether they can be compared or not. The results show that the correlation between the measures is in some cases quite low, and some measures are especially sensitive to certain types of deviations from the ground truth.
By Werner Bailer, JRS |
|
| July 25, 2010 |
|
Title:
"2020 3D Media: New directions in Immersive Entertainment" |
 |
|
(Poster abstract) This research project is conducted by a consortium of European industrial and academic partners that includes companies such as Technicolor, Digital Projection, DTS Europe, Doremi, Mediapro, Creative Workers (CREW) and Datasat, and the research centers Barcelona Media, Joanneum Research, University of Hasselt, University of Reading and Fraunhofer. It aims to research, develop and demonstrate novel forms of compelling entertainment experiences based on new technologies for the capture, production, networked distribution and display of three-dimensional sound and images.
By S. Fort, BM |
|
| August, 2010 |
|
Title:
"Spatial string matching for image classification" |
 |
|
This paper presents a spatial string matching
method to incorporate spatial information into the bag-of-words model, which represents an image as an unordered
distribution of local features. Spatial constraints among
neighboring features are explored in order to achieve
better discrimination power for image classification. The
features from neighboring points are combined together
and taken as a spatial string, and then our method
matches the images according to the similarity of string
pairs. The categorization problem can be formulated
using KNN or SVM classifier based on the spatial string
matching kernel. The proposed method is able to capture
spatial dependencies across the neighboring features.
Experimental results show promising performance for
image classification tasks.
By Yunqiang Liu and Vicent Caselles, BM |
|
| August 23, 2010 |
|
Title:
"Stereoscopic image inpainting: distinct depth maps and images inpainting" |
 |
|
In this paper we propose an algorithm for inpainting of stereo images. The issue is to reconstruct the holes in a pair of stereo images as if they were the projection of a 3D scene. Hence, the reconstruction of the missing information has to produce a consistent visual perception of depth. Thus, the first step of the algorithm consists in the computation and inpainting of disparity maps in the given holes. The second step of the algorithm is to fill in missing regions using the complete disparity maps in a way that avoids the creation of 3D artifacts. We present some experiments on several pairs of stereo images.
By Alexandre Hervieu, Nicolas Papadakis, Aurélie Bugeau, Pau Gargallo, Vicent Caselles, BM |
|
| September 9, 2010 |
|
Title:
"A New Infrastructure for High Resolution 3D and Slow Motion Production-Sets Featuring the 'FlashPak II' Field-Recorder Demonstrator" |
 |
|
The current success of immersive 3D experiences in feature films and the resulting move to push such technology into TV call for the extension of today's standard digital HD or 2K/4K capture process towards stereo imaging or other means of depth representation. At present, stereoscopic high-resolution capture rigs employ proprietary workflows and a non-unified infrastructure based on legacy technology taken from the 2D domain. As this sudden demand for at least doubled recording bandwidth and storage space is clearly stretching the practicability limits of this legacy infrastructure, the already overdue paradigm shift towards a modern, future-proof capture and recording infrastructure is getting another push. This paper describes a comprehensive infrastructure approach based on new interface and field recording technologies for capture devices that targets the challenging requirements of today's and future 3D production setups. The new interface approach harmonizes and consolidates image data, depth information and general metadata to ease 3D production workflows. Furthermore, a handy field recorder solution based on solid state technology is introduced that is able to record up to seven uncompressed camera streams in parallel.
By T. Brune and J.P. Wittenburg, Technicolor |
|
| September 20, 2010 |
|
Title:
"CUDA Acceleration of Color Histogram Matching" |
 |
|
A common approach to histogram matching uses cumulative distribution functions (CDFs). First, we calculate the normalized histograms of the two images (hA and hB) and their respective cumulative distribution functions (CDFA and CDFB). Then, a matching between the CDFs is performed. Given the reference CDF, CDFA, for each gray level GI we find the corresponding gray level GJ for which CDFA(GI) = CDFB(GJ); if I and J correspond to different values, a gray level in the image IB is replaced in order to match the CDFs. Our approach considers the ideas of [Rolland et al. 2000] with an Nvidia 3D broadcast solution system using professional HD cameras.
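The CDF-based matching described above can be sketched in a few lines of plain Python. This is only an illustrative CPU sketch under stated assumptions, not the authors' CUDA implementation: images are assumed to be flat lists of integer gray levels in the range 0 to `levels - 1`, and the function names (`normalized_cdf`, `matching_lut`, `match_histogram`) are hypothetical. Because gray levels are discrete, exact equality of the two CDFs is rarely attainable, so the sketch picks the smallest reference level whose CDF reaches the source value.

```python
def normalized_cdf(pixels, levels=256):
    """Cumulative distribution of the gray levels found in `pixels`."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf


def matching_lut(cdf_src, cdf_ref):
    """Lookup table: source level -> smallest reference level whose CDF
    is at least the source CDF value (both CDFs are non-decreasing)."""
    lut = []
    j = 0
    last = len(cdf_ref) - 1
    for value in cdf_src:
        while j < last and cdf_ref[j] < value:
            j += 1
        lut.append(j)
    return lut


def match_histogram(image_b, image_a, levels=256):
    """Remap the gray levels of image_b so its CDF follows that of image_a."""
    lut = matching_lut(normalized_cdf(image_b, levels),
                       normalized_cdf(image_a, levels))
    return [lut[p] for p in image_b]


# Hypothetical 4-level example: a dark image matched to a bright reference.
matched = match_histogram([0, 0, 1, 1], [2, 2, 3, 3], levels=4)
```

Each pixel is remapped independently through the lookup table, which is the kind of per-pixel work that parallelizes well on a GPU.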
By Antonio S. Montemayor, Raúl Cabido, Juan José Pantrigo, URJC, and Xavi Rodríguez, Sergi Sagàs, Mediapro |
|
| September 26, 2010 |
|
Title:
"Polyconvexification of the multi-label optical flow problem" |
 |
|
In this paper the problem of optical flow and occlusion mask
estimation is addressed. To that end, we consider a multi-label
representation of the optical flow and we define an energy that
models the problem. The convexification of the energy and
its minimization with an iterative algorithm are studied. Our
algorithm is implemented on the GPU, since each pixel can be
processed in parallel. From our experiments, the relation between
the quality of the results obtained and computing time
seems to be very promising.
By N. Papadakis, A. Baeza, P. Gargallo, BM and V. Caselles, UPF |
|
| October, 2010 |
|
Title:
"A Comprehensive Framework for Image Inpainting" |
 |
|
Inpainting is the art of modifying an image in a form that is not detectable by an ordinary observer. There are numerous and very different approaches to tackle the inpainting problem, though as explained in this paper, the most successful algorithms are based upon one or two of the following three basic techniques: copy-and-paste texture synthesis, geometric partial differential equations (PDEs), and coherence among neighboring pixels. We combine these three building blocks in a variational model, and provide a working algorithm for image inpainting trying to approximate the minimum of the proposed energy functional. Our experiments show that the combination of all three terms of the proposed energy works better than taking each term separately, and the results obtained are within the state-of-the-art.
By A. Bugeau, M. Bertalmio, V. Caselles, G. Sapiro, BM, UPF |
|
| November 17, 2010 |
|
Title:
"Tracking and Clustering Salient Features in Image Sequences" |
 |
|
Many applications in media production need information about moving objects in the scene, e.g. insertion of computer-generated objects, association of sound sources to these objects or visualization of object trajectories in broadcasting. We present a GPU accelerated approach for detecting and tracking salient features in image sequences and we propose an algorithm for clustering the obtained feature point trajectories in order to obtain a motion segmentation of the set of feature trajectories. Evaluation results for both the tracking and clustering steps are presented. Finally we discuss the application of the proposed approach for associating audio sources to objects to support audio rendering for virtual sets.
By Jakub Rosner, Silesian University and Werner Bailer, Hannes Fassold, Felix Lee, JRS |
|
| November, 2010 |
|
Title:
"Virtual Camera Synthesis for Soccer Game Replays" |
 |
|
In this paper, we present a set of tools developed during the creation of a platform that allows the automatic generation of virtual views in a live soccer game production. Observing the scene through a multi-camera system, a 3D approximation of the players is computed and used for the synthesis of virtual views. The system is suitable both for static scenes, to create bullet time effects, and for video applications, where the virtual camera moves as the game plays.
By N. Papadakis, A. Baeza, X. Armangué, I. Rius, A. Bugeau, O. D'Hondt, P. Gargallo, V. Caselles and S. Sagas, BM, Mediapro |
|
| November, 2010 |
|
Title:
"Unsupervised Motion Layer Segmentation by Random Sampling and Energy Minimization" |
 |
|
In this paper we introduce an unsupervised scheme for the segmentation of motion layers in video sequences. The number of layers is automatically determined by the method. Our approach first extracts the motion models using a RANSAC-based random sampling algorithm improved by the use of geodesic distance information. Then those models are assigned to pixels in the color image by minimizing an energy functional using graph cuts. Our energy takes into account motion residuals, color distributions, geodesic distance as well as temporal consistency of the layers. Moreover, we define a smoothness term that enforces a patch-wise spatial coherency on areas where optical flow is reliable and a pixel-wise coherency on occluded areas. The method leads to promising results on the tested sequences.
By Olivier D'Hondt and Vicent Caselles, BM |
|
| November 17, 2010 |
|
Title:
"A Comprehensive Infrastructure and Workflow for Acquisition of High-resolution Multi-view Content" |
 |
|
The current success of immersive 3D experiences in feature films and the trend towards 3D TV require advanced tools and workflows for high-quality capture of multi-view live scenes including depth information. These requirements are not fulfilled by today's capture workflows and infrastructures based on legacy technology from 2D capture solutions. We propose a comprehensive infrastructure for capture of multi-view content supporting different methods to obtain depth information. The paper describes a device for field capture and tools for online and offline content analysis and browsing of the captured content that have been integrated into the proposed infrastructure.
By Werner Bailer, Christian Schober, JRS and Thomas Brune, Technicolor |
|
| December 28, 2010 |
|
Title:
"A Variational Framework for Exemplar-Based Image Inpainting" |
 |
|
In this paper we propose an algorithm for inpainting
of stereo images. The issue is to reconstruct the holes in
a pair of stereo images as if they were the projection of a
3D scene. Hence, the reconstruction of the missing information
has to produce a consistent visual perception
of depth. Thus, the first step of the algorithm consists in
the computation and inpainting of disparity maps in the
given holes. The second step of the algorithm is to fill-in
missing regions using the complete disparity maps in a
way that avoids the creation of 3D artifacts. We present
some experiments on several pairs of stereo images.
By Pablo Arias, Gabriele Facciolo, Guillermo Sapiro, Vicent Caselles, BM |
|
| January 4, 2011 |
|
Title:
"Stereoscopic Image Inpainting Using Scene Geometry" |
 |
|
In this paper we propose an algorithm for inpainting
of stereo images. The issue is to reconstruct the holes in
a pair of stereo images as if they were the projection of a
3D scene. Hence, the reconstruction of the missing information
has to produce a consistent visual perception
of depth. Thus, the first step of the algorithm consists in
the computation and inpainting of disparity maps in the
given holes. The second step of the algorithm is to fill-in
missing regions using the complete disparity maps in a
way that avoids the creation of 3D artifacts. We present
some experiments on several pairs of stereo images.
By Alexandre Hervieu, Nicolas Papadakis, Aurélie Bugeau, Pau Gargallo, Vicent Caselles, BM |
|
| January 15, 2011 |
|
Title:
"Video Browsing Using Object Trajectories" |
 |
|
Video browsing methods are complementary to search and retrieval approaches, as they allow for exploration of unknown content sets. Objects and their motion convey important semantics of video content, which is relevant information for video browsing. We propose extending an existing video browsing tool in order to support clustering of objects with similar motion and visualization of the objects’ positions and trajectories. This requires the automatic extraction of moving objects and estimation of their trajectories, as well as the ability to group objects with similar trajectories. For the first issue we describe the application of a recently proposed motion trajectory clustering algorithm; for the second we use k-medoids clustering and the dynamic time warping distance. We present evaluation results of both steps on real world traffic sequences from the Hopkins155 data set. Finally, we describe the representation of the analysis results using MPEG-7 and their integration into the video browsing tool.
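To make the trajectory grouping step more concrete, the following is a minimal Python sketch of the dynamic time warping (DTW) distance between two trajectories, together with the medoid selection that k-medoids repeats for each cluster. It is an illustration under stated assumptions rather than the authors' implementation: trajectories are assumed to be lists of (x, y) points, the point-to-point cost is assumed to be Euclidean, and the names `dtw_distance` and `medoid_index` are hypothetical.

```python
import math


def dtw_distance(traj_a, traj_b):
    """Classic O(n*m) dynamic time warping between two point sequences."""
    n, m = len(traj_a), len(traj_b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i and first j points.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(traj_a[i - 1], traj_b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # repeat point of b
                                    cost[i][j - 1],      # repeat point of a
                                    cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def medoid_index(trajectories):
    """Index of the trajectory with the smallest summed DTW distance
    to all trajectories in the group (the k-medoids update step)."""
    return min(
        range(len(trajectories)),
        key=lambda i: sum(dtw_distance(trajectories[i], t)
                          for t in trajectories),
    )


# Two trajectories along the same path, sampled at different rates.
slow = [(0, 0), (0, 0), (1, 0), (2, 0)]
fast = [(0, 0), (1, 0), (2, 0)]
same_path = dtw_distance(slow, fast)
```

Unlike a point-by-point comparison, DTW tolerates objects that follow the same path at different speeds, and a medoid is always one of the observed trajectories, so no averaging of variable-length sequences is needed.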
By Felix Lee and Werner Bailer, JRS |
|
|