Identifier |
000463833 |
Title |
Achieving total 3D human capture with MocapNETs |
Alternative Title |
Ολική εκτίμηση της τρισδιάστατης ανθρώπινης πόζας με MocapNETs |
Author
|
Qammaz, Ammar
|
Thesis advisor
|
Αργυρός, Αντώνης
|
Reviewer
|
Στεφανίδης, Κωνσταντίνος
Ζαμπούλης, Ξενοφώντας
Τραχανιάς, Παναγιώτης
Κομοντάκης, Νικόλαος
Daniilidis, Kostas
Vincze, Markus
|
Abstract |
The goal of this thesis was to investigate and develop a novel, fast, portable, robust and accurate
plug-and-play 3D Human Capture module that receives RGB images captured in the wild and regresses
the 3D body configuration of any person depicted in the scene. The proposed architecture was built from
scratch using first principles and recent advances in neural networks, and took its final form as an
ensemble of neural networks. We identified and bridged gaps between state-of-the-art deep learning
methods and well-established model-based vision methodologies predating CNNs. Its name, “MocapNET”,
was coined to describe it concisely, as it became the first neural network-based method in the literature
to directly regress Motion Capture (Mocap) output in an end-to-end fashion. To improve accuracy and
address personalization aspects, a novel real-time generative optimization algorithm, named
“Hierarchical Coordinate Descent”, was also developed and tailored to the conditionally independent
encoders of the MocapNET ensemble, complementing their output. The ambition and scope of the retrieved
3D output gradually broadened as the method successfully generalized to more articulated structures
during the course of its development. The total 3D capture solution presented includes the upper body,
lower body, hands, face and gaze. By the term 3D Human Capture we refer not only to positions in 3D
space but to the full kinematic solution of the skeleton. The method performs in real time and its output
is natively compatible with 3D editing software due to its BVH container. This makes it globally unique
and among a select few methods that can successfully tackle all of these sub-problems, which were
traditionally separate sub-fields of broader computer vision research. The 3D human pose estimation
solution developed can be used in devices such as mobile phones, AR/VR headsets, self-driving cars,
smart devices, and home and factory robots, endowing them with capabilities to perceive, compare and
enumerate human body poses, which would ultimately facilitate understanding of human behavior. The
thesis attempts to carefully document all aspects of the method, including 2D shape descriptors, NN
design, PCA compression to allow usage on mobile devices, and the various attempts that shaped the
method into its final version.
|
Language |
English |
Subject |
3D human pose estimation |
|
Hierarchical coordinate descent (HCD) |
|
Holistic motion capture |
|
Inverse kinematics (IK) |
|
Neural network ensemble |
|
Αντίστροφη κινηματική |
|
Εκτίμηση 3Δ πόζας ανθρώπινου σώματος |
|
Ιεραρχική κάθοδος συντεταγμένων |
|
Ολιστική σύλληψη κίνησης |
|
Σύνολα νευρωνικών δικτύων |
Issue date |
2024-07-26 |
Collection
|
School/Department--School of Sciences and Engineering--Department of Computer Science--Doctoral theses
|
|
Type of Work--Doctoral theses
|
Permanent Link |
https://elocus.lib.uoc.gr//dlib/c/4/9/metadata-dlib-1712309838-104172-5512.tkl
|