|
Identifier |
000406891 |
Title |
Exploring importance measures for summarization on graph databases |
Alternative Title |
Εξερεύνηση μέτρων σημαντικότητας για δημιουργία συνόψεων σε βάσεις δεδομένων γράφων |
Author
|
Παππάς, Αλέξανδρος Σ.
|
Thesis advisor
|
Πλεξουσάκης, Δημήτρης
|
Reviewer
|
Γεωργακόπουλος, Γεώργιος
Φλουρής, Γεώργιος
|
Abstract |
The real world is richly interconnected. As such, the natural properties of graphs render them extremely useful for modeling the real world, understanding a wide diversity of datasets, and offering applied solutions in different fields of industry.
A graph database is an on-line, operational database management system with Create, Read, Update, and Delete (CRUD) methods that expose a graph data model. As an alternative to traditional relational databases, graph databases are designed and optimized predominantly for graph workloads, traversal performance, and the execution of graph algorithms on complex hierarchical structures.
Given the explosive growth in the size and the complexity of the Data Web, it is estimated that by the end of 2018, 70% of leading organizations will have one or more utilizing graph databases. Triple stores are a subcategory of graph databases, modeled around the Resource Description Framework (RDF) specifications and designed as labeled, directed multi-graphs.
In this direction, there is now, more than ever, an increasing need to develop methods and tools that facilitate the understanding and exploration of RDF/S Knowledge Bases (KBs). Given that the human brain can interpret at most a few hundred nodes in one chart, it becomes obvious that current data sizes and schema complexity are far beyond the exploration capability that any automated layout can provide.
Summarization approaches try to produce an abridged version of the original data source, highlighting the most representative concepts. Two central questions in summarization are how to identify the most important nodes and how to link them in order to produce a valid sub-schema graph. In this thesis, we try to answer the first question by revisiting several measures covering a wide range of alternatives for selecting the most important nodes and adapting them for RDF/S KBs. We then model the problem of linking those nodes as a graph Steiner-Tree problem (GSTP). Since the GSTP is NP-complete, we explore three approximations (SDIST, CHINS and HEUM), employing heuristics to speed up the execution of the respective algorithms. Our detailed experiments show the added value of our approach, since (a) our adaptations outperform current state-of-the-art measures for selecting the most important nodes and (b) the constructed summary is of better quality in terms of the additional nodes introduced into the generated summary, as the GSTP approximations outperform past approaches.
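The two-step pipeline described in the abstract (pick the most important nodes, then link them with as few additional nodes as possible) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy schema graph, the use of plain node degree as the importance measure, and the greedy shortest-path heuristic that attaches the nearest remaining terminal to the growing tree. It is a generic Steiner-tree approximation and is not claimed to reproduce the thesis's adapted measures or its SDIST, CHINS, or HEUM algorithms.

```python
from collections import deque

# Hypothetical toy schema graph (undirected adjacency sets); not from the thesis.
SCHEMA = {
    "Person": {"Agent", "Paper", "Organization"},
    "Agent": {"Person"},
    "Paper": {"Person", "Venue", "Topic"},
    "Venue": {"Paper", "City"},
    "Topic": {"Paper"},
    "City": {"Venue", "Country"},
    "Country": {"City"},
    "Organization": {"Person"},
}


def top_k_by_degree(graph, k):
    """Rank nodes by degree (a stand-in importance measure); ties break by name."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]


def nearest_path_to_tree(graph, tree_nodes, targets):
    """Multi-source BFS from the current tree; return a path to the closest target."""
    parent = {n: None for n in tree_nodes}
    queue = deque(sorted(tree_nodes))
    while queue:
        node = queue.popleft()
        if node in targets:
            path = []
            while node is not None:  # walk back to the tree node that reached it
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in sorted(graph[node]):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None  # the remaining targets are unreachable from the tree


def steiner_tree_sketch(graph, terminals):
    """Greedily attach the nearest remaining terminal via a shortest path."""
    terminals = list(terminals)
    tree_nodes = {terminals[0]}
    tree_edges = set()
    remaining = set(terminals[1:])
    while remaining:
        path = nearest_path_to_tree(graph, tree_nodes, remaining)
        if path is None:
            break
        tree_edges.update(frozenset(pair) for pair in zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_nodes, tree_edges


terminals = top_k_by_degree(SCHEMA, 3)
nodes, edges = steiner_tree_sketch(SCHEMA, terminals)
# Nodes added only to connect the selected terminals.
print(sorted(nodes - set(terminals)))  # ['Venue']
```

In this toy run the three selected nodes are connected through a single extra node, which mirrors the quality criterion mentioned in the abstract: the fewer additional nodes a linking strategy introduces, the more compact the resulting summary.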
|
Language |
English |
Issue date |
2017-03-17 |
Collection
|
School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
|
|
Type of Work--Post-graduate theses
|
Permanent Link |
https://elocus.lib.uoc.gr//dlib/1/e/8/metadata-dlib-1488542601-358182-9964.tkl
|
Views |
730 |