E-Locus - Institutional Repository of the University of Crete - A computational framework for observing and understanding the interaction of humans with objects of their environment

Identifier

000391347

Title

A computational framework for observing and understanding the interaction of humans with objects of their environment

Alternative Title

Ένα υπολογιστικό πλαίσιο για την παρατήρηση και κατανόηση της αλληλεπίδρασης ανθρώπων με αντικείμενα του περιβάλλοντός τους.

Author

Κυριαζής, Νικόλαος

Thesis advisor

Αργυρός, Αντώνης

Reviewer

Τσαμαρδινός, Ιωάννης
Λουράκης, Μανόλης
Τραχανιάς, Πάνος
Παπαγιαννάκης, Γιώργος
Δανιηλίδης, Κώστας
Beetz, Michael

Abstract

We focus on the problem of vision-based scene understanding, i.e., “lifting” a scene that is observed by visual means over time to a symbolic representation that can be processed by a computational system. We are interested in dynamic indoor scenes, in which humans purposefully interact with their environment. We observe that existing approaches have performed scene understanding mainly through coarse modelling of the observed processes, as more detailed modelling is very demanding in terms of computational resources and is difficult to integrate with computer vision methods. We suggest that it is now feasible to incorporate detailed scene modelling, which can be easily integrated with computer vision techniques and can efficiently cope with the associated computational requirements. With respect to scene understanding, we are in a position to model and simulate the process of image acquisition through 3D rendering (appearance), and the dynamics of the observed processes through physics simulation (behavior). Thus, we identify 3D rendering and physics simulation as two significant processes towards scene understanding. We propose combining the simulation power of these tools with effective optimization methods, in order to yield powerful inference tools for scene understanding. More specifically, we consider the process of scene understanding as an optimization problem. We design parametric models that describe what can take place in a dynamic scene and how this can be observed by visual means. We define these parameters to constitute the domain of the optimization problem. Optimization is decoupled from modelling and is performed in a hypothesize-and-test framework which is implemented based on black-box optimization techniques. The outcome of the optimization is the instance of the parametric models that best “explains” the observations.
Ultimately, in the context of this work, the tested hypotheses agree with the laws of physics, because they originate from physics simulators. For every hypothesis, its compatibility with actual observations of the scene is evaluated through 3D rendering. Thus, our proposal focuses on three points: (a) forward modelling of the scene, (b) incorporation of physics simulation and (c) exploitation of black-box optimization methods. We have developed a computational framework which, based on the above, performs aspects of 3D scene understanding. We present this framework and its application to the problems of 3D tracking and motion estimation. We emphasize the necessity of incorporating physics. More specifically, we show that by acknowledging that visual observations concern physical phenomena governed by the laws of physics, we can even apply inference on initially “hidden” parameters. In particular, we can estimate parameters that, prior to incorporating physics, were not directly observable, and which can be recovered only by attributing observations to side-effects of physical processes. The proposed computational framework has been employed to solve problems that vary from tracking a single object to tracking two hands as they interact with many objects, in 3D and from different visual modalities and camera arrangements. Through a series of experiments we show how important it is to incorporate computer graphics and physics processes in 3D scene understanding. These processes were successfully used as black-box simulation tools and their inherent complexity has not hindered the integration with computer vision processes, thanks to the design choice of employing black-box optimization. We were also able to show that the proposed framework exhibits a favorable scalability profile when applied to domains of increasing complexity.
Through careful design, the invocation of otherwise expensive simulations can be performed so efficiently that interactive processing frame rates are achieved. All of the above argues for a modular computational solution to 3D scene understanding problems, with clear potential for improvement or generalization: substituting parts with better or more general modules automatically improves the entire framework.
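
The hypothesize-and-test loop described in the abstract can be illustrated with a toy sketch. This is not the thesis implementation: every name below (`simulate`, `render`, `discrepancy`, `black_box_minimize`) is hypothetical, the "physics simulator" is a bouncing point mass, the "renderer" simply returns the simulated positions, and a shrinking random local search stands in for whichever black-box optimizer the thesis actually uses. The restitution coefficient plays the role of a "hidden" parameter: it is never observed directly and is recoverable only through its effect on the trajectory after the bounce.

```python
import random

GRAVITY = -9.81  # m/s^2, acts along y
DT = 0.01        # integration step in seconds (assumed value)
STEPS = 150      # number of simulated frames (assumed value)


def simulate(params, steps=STEPS, dt=DT):
    """Physics stand-in: a point mass launched from (0, 1) that bounces on y = 0.

    params = (vx, vy, restitution); restitution is the "hidden" parameter.
    """
    vx, vy, restitution = params
    x, y = 0.0, 1.0
    trajectory = []
    for _ in range(steps):
        vy += GRAVITY * dt
        x += vx * dt
        y += vy * dt
        if y < 0.0:  # ground contact: reflect position, damp and flip velocity
            y = -y
            vy = -restitution * vy
        trajectory.append((x, y))
    return trajectory


def render(trajectory):
    """Rendering stand-in: an idealized camera that observes positions directly."""
    return list(trajectory)


def discrepancy(params, observations):
    """Sum of squared differences between a rendered hypothesis and the observations."""
    predicted = render(simulate(params))
    return sum((px - ox) ** 2 + (py - oy) ** 2
               for (px, py), (ox, oy) in zip(predicted, observations))


def black_box_minimize(objective, bounds, iterations=2000, seed=0):
    """Random local search; it only ever calls objective(candidate)."""
    rng = random.Random(seed)
    best = [(lo + hi) / 2.0 for lo, hi in bounds]  # first hypothesis: box center
    best_cost = objective(best)
    for i in range(iterations):
        scale = 1.0 - i / iterations  # shrink the search radius over time
        candidate = [
            min(hi, max(lo, b + rng.gauss(0.0, 0.25 * scale * (hi - lo))))
            for b, (lo, hi) in zip(best, bounds)
        ]
        cost = objective(candidate)
        if cost < best_cost:  # keep the hypothesis that explains the data best
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    true_params = [2.0, 3.0, 0.7]  # synthetic ground truth, for illustration only
    observations = render(simulate(true_params))
    bounds = [(0.0, 5.0), (0.0, 5.0), (0.3, 1.0)]
    estimate, cost = black_box_minimize(
        lambda p: discrepancy(p, observations), bounds)
    print("estimate:", estimate, "cost:", cost)
```

Because the optimizer touches the model only through `objective`, the simulator or the renderer could be swapped for a more detailed one without changing the search code, which is the modularity the abstract argues for.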

Language

English

Subject

3D

Graphics

Hand

Multiple

Object

Optimization

Physics

Issue date

2014-11-04

Collection

School/Department--School of Sciences and Engineering--Department of Computer Science--Doctoral theses

Type of Work--Doctoral theses
