Identifier 000438731
Title Exploring real-time data analytics using distributed stream processing systems
Alternative Title Διερεύνηση τεχνικών ανάλυσης δεδομένων σε πραγματικό χρόνο με χρήση κατανεμημένων συστημάτων επεξεργασίας ροών
Author Μποφίλη Αρβανίτη, Ιωάννα Μαρία
Thesis advisor Μαγκούτης, Κωνσταντίνος
Reviewer Πλεξουσάκης, Δημήτρης
Πρατικάκης, Πολύβιος
Abstract Processing high-volume streaming data is an important enabling technology for IoT-driven, social networking, and other e-services, and has given rise to a new generation of stream-processing systems (SPSs). In this thesis, we apply modern SPS technologies to improve the state of the art in two application areas involving real-time position tracking in physical space: maintaining profiles of visitor movement in exhibit spaces and predicting service-level objective violations in mass transit systems. To ensure that scalable SPSs can seamlessly adapt to varying levels of load by adjusting their resources, we also implement a mechanism by which SPSs can scale dynamically even when such capability is not natively supported by the SPS and the underlying resource management platform. Our first application of SPS technologies to real-time data analytics is the development of dynamic behavioral profiles of visitors in exhibit spaces, based on their movements in physical space. While related approaches have been explored in the past, this thesis is the first to apply stream-processing technologies to materialize behavioral theories developed in the social sciences and to collect richer information about visitors' interests. Such profiles can be used to recommend exhibits to visitors, to decide the best content to present to them, or to design personalized questionnaires. Our second application of SPS technologies to real-time data analytics is training models that predict which mass-transit vehicles are likely to violate service-level objectives on route duration. We extend previous delay prediction techniques with the ability to apply predictions in real time. We also address the tuning and support needed for scalable, adaptive data analytics by examining various parameters of the SPS and its ingest engine.
Even in cases where dynamic scaling is not explicitly supported by the SPS platform, we demonstrate a technique that achieves scale-out with low downtime.
Language English
Subject Big data
Data analysis
Spark structured streaming
Streaming data
Δεδομένα μεγάλου όγκου
Ροές δεδομένων
Issue date 2021-03-26
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses