Abstract |
Due to the advent of digital TV and the availability of large video databases, the task
of automatic video classification has received great research interest. The objective
of video classification is to label a video sequence with its corresponding class, among
a predefined set of classes. Typically, full-resolution video data is required for the
extraction of appropriate features. However, in limited-resource sensing
systems, which arise in applications such as video surveillance and remote sensing, such processing can be demanding in both computation and power, placing a significant burden
on the encoder's side. Additionally, a large bandwidth is required to transmit full-resolution data to a base station for further processing.
In this thesis we address the aforementioned problems by exploiting the framework
of compressive sensing. Compressive sensing, acting simultaneously as a sampling and
compression protocol, enables the efficient representation and reconstruction of a sparse
signal from a set of non-adaptive, linear, incoherent measurements, far fewer than
the number required by the Nyquist sampling theorem. Here, we exploit the properties of linear
random projections for addressing the problem of video classification without handling
the original high-resolution data. In particular, we introduce two compressive video
classification systems that work directly in the compressed domain. We assume the
scenario of a video classification system equipped with a single-pixel camera that can
directly acquire compressive samples in the optical domain.
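
The idea of working directly on linear random projections can be illustrated with a minimal sketch. It assumes a dense i.i.d. Gaussian measurement matrix and synthetic 32x32 frames. Neither choice comes from the thesis, and the helper name `compressive_measurements` is hypothetical. The sketch only shows that the distance between two frames is approximately preserved after projection, which is the property that makes classification in the compressed domain plausible.

```python
import numpy as np


def compressive_measurements(frame, num_measurements, rng):
    """Return y = Phi @ x for a vectorized frame x, together with Phi.

    Phi is an assumed i.i.d. Gaussian matrix, scaled by 1/sqrt(M) so that
    squared norms are preserved in expectation. A real single-pixel camera
    may use a different measurement ensemble.
    """
    x = np.asarray(frame, dtype=float).ravel()
    phi = rng.standard_normal((num_measurements, x.size)) / np.sqrt(num_measurements)
    return phi @ x, phi


rng = np.random.default_rng(0)

# Two synthetic "frames" standing in for video data (illustrative only).
frame_a = rng.random((32, 32))
frame_b = rng.random((32, 32))

# Acquire 256 measurements per frame (25% of the 1024 pixels),
# using the same measurement matrix for both frames.
y_a, phi = compressive_measurements(frame_a, 256, rng)
y_b = phi @ frame_b.ravel()

# Compare the Euclidean distance before and after projection.
original_distance = np.linalg.norm(frame_a - frame_b)
compressed_distance = np.linalg.norm(y_a - y_b)
ratio = compressed_distance / original_distance

print(f"distance ratio (compressed / original): {ratio:.3f}")
```

A ratio close to 1 indicates that a distance-based decision rule applied to the measurements behaves much like the same rule applied to the full-resolution frames.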
In the first system, the compressively sampled frames are directly used as features
along with an appropriate decision rule to classify a query sequence. In the second
system, a block-based compressive acquisition model is used together with dictionary
learning and a support vector machine (SVM) with a spatio-temporal pyramid matching kernel for the classification phase. The proposed methods are evaluated using a
subset of the UCF50 activity recognition dataset. The results verify the efficiency of
the proposed video classification systems and illustrate that features based on compressive measurements, in conjunction with an appropriate decision rule, result in an effective video classification scheme that meets the constraints of systems with limited
resources. In addition, the comparison with a conventional video classification scheme
that exploits the full-resolution video data illustrates that, although only a small percentage
of the original data is used in the compressive video classification systems, no
significant degradation in performance is observed.
|