Abstract |
As digital cameras become cheaper, multi-camera systems or camera networks are
becoming commonplace. Calibrated multi-view setups are associated with some strong
assumptions and their intrinsic/extrinsic calibration is a tedious process. Nevertheless,
their ability to reduce occlusion effects and appearance ambiguities leads to more robust
computer vision algorithms, a fact that typically outweighs their disadvantages. A great
number of computer vision applications employ such setups in order to acquire rich 3D information
regarding the environment in which they operate. These applications typically
have real-time performance requirements. Past approaches to real-time multi-view 3D
reconstruction employed expensive special-purpose hardware and/or powerful mainframes
to achieve real-time performance at the required quality.
The goal of this work is the high-quality, real-time 3D reconstruction of a scene based
on visual input provided by a multi-camera system. To achieve high quality in reconstruction,
we propose a novel algorithm for optimizing the parameters of the underlying
foreground segmentation process. Furthermore, to meet the real-time performance requirements,
we propose and implement a complete GPU reconstruction pipeline that takes color
multi-frames as input and produces textured 3D meshes as output. This is in contrast
to existing GPU-based shape-from-silhouette approaches, where the input is binary foreground
images, typically processed by the host's CPU, and the output is a volumetric
representation of the visual hull of the scene.
The contributions of this thesis include (a) a novel algorithm for unsupervised learning
of optimal foreground detection parameters in multi-camera systems, (b) the implementation
for GPU execution of a complete 3D reconstruction pipeline that includes novel parallelizations
of popular foreground segmentation and 3D reconstruction algorithms along
with parallel implementations of common graphics algorithms, and (c) the design and realization
of a scalable architecture implementing a physical multi-view 3D reconstruction
system, along with its deployment in real-world computer vision applications.
Extensive experimental results confirm the effectiveness of the adopted approach for
learning the parameters of foreground detection. The performance analysis of the proposed
multi-view system also demonstrates that an accurate, high-resolution, texture-mapped
3D reconstruction of a scene observed by eight cameras is achievable in real time
with a single GPU. Comparisons against the state-of-the-art in GPU-powered 3D reconstruction
on a standard dataset show that the proposed system outperforms most
of the competition. Finally, the deployment of the proposed 3D reconstruction system
in real-world applications (Archaeological Museum of Thessaloniki, permanent exhibition
"Macedonia: from fragments to pixels") provides strong evidence of its robustness,
efficiency and effectiveness.
|