Abstract
Linked Data is a method for publishing structured data that allows the data to be interlinked (by using URIs instead of simple values), thereby assisting their integration.
A large number of such datasets, hereafter sources, has already been published according to the principles of Linked Data, and their number and size keep increasing.
However, it is currently not evident how connected these datasets are. In particular, it is difficult (a) to obtain complete information about one particular URI (or a set of URIs), (b) to discover datasets that are relevant to a given one, and (c) to compute and visualize the degree of connectivity between two or more datasets.
All the aforementioned tasks are important for the integration process in an open and evolving environment. To alleviate this problem, in this thesis we introduce metrics, indexes and algorithms that allow the computation and quantification of connectivity among several datasets.
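To illustrate the kind of quantification meant here (the precise metrics are defined in the thesis; the dataset names and URIs below are invented for the example), the connectivity of a pair of datasets can be expressed as the number and percentage of URIs they share:

```python
# Hypothetical sketch: quantify pairwise connectivity as the number
# of common URIs and their Jaccard similarity. All names are made up.

def common_uris(d1: set, d2: set) -> set:
    """URIs appearing in both datasets."""
    return d1 & d2

def jaccard(d1: set, d2: set) -> float:
    """Share of common URIs over all distinct URIs of the two datasets."""
    union = d1 | d2
    return len(d1 & d2) / len(union) if union else 0.0

fishbase = {"http://ex.org/thunnus", "http://ex.org/sardina"}
worms    = {"http://ex.org/thunnus", "http://ex.org/gadus"}

print(len(common_uris(fishbase, worms)))   # 1 common URI
print(round(jaccard(fishbase, worms), 2))  # 0.33
```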
To achieve scalability, we propose (i) a namespace-based prefix index, (ii) a sameAs catalog for computing the symmetric and transitive closure of the sameAs relationships encountered in the datasets, (iii) a semantics-aware element index (that exploits the aforementioned indexes), (iv) a lattice of the common elements of any set of datasets, and (v) two lattice-based incremental algorithms for speeding up the computation of the lattice.
We apply and evaluate the proposed approach in the context of a real and operational semantic warehouse containing information about the marine domain (where the metrics are used for assessing the quality of the semantic warehouse and its underlying sources, and for monitoring the quality of the semantic warehouse after a reconstruction), as well as for three hundred LOD cloud datasets.
We report measurements that have not been carried out in the past (such as the number of common URIs among three or more datasets and the frequency of prefixes), we offer novel services (such as finding equivalent URIs and finding the most relevant datasets for a given dataset), and we discuss the speedup obtained by the proposed indexes and algorithms. Finally, we propose an extension of the VoID ontology for publishing, sharing and exploiting such measurements.