Post-graduate theses

Identifier 000428731
Title Keyword search over RDF using document-centric information retrieval systems
Alternative Title Αναζήτηση μέσω λέξεων-κλειδιών επί RDF δεδομένων χρησιμοποιώντας εγγραφοκεντρικά συστήματα ανάκτησης πληροφοριών
Author Καντηλιεράκης, Γιώργος Δ.
Thesis advisor Τζίτζικας, Γιάννης
Reviewer Πλεξουσάκης, Δημήτρης
Φλουρής, Γιώργος
Abstract There are thousands of datasets published according to the principles of Linked Data and the Semantic Web. Many of those datasets, organized in RDF, are maintained either in cross-domain Knowledge Bases (e.g. DBpedia, Wikidata) or in domain-specific repositories (e.g. DrugBank, MarineTLO), and are mainly used through navigation and structured query languages like SPARQL. However, these techniques are complex, lack flexibility, and may require full knowledge of the underlying ontology. As a result, these datasets are exploited by expert users only. On the other hand, keyword search is the most widely used method for searching. Keyword search is user-friendly and offers instant access to content, and keyword queries support a wide range of expression while being extremely flexible. Information Retrieval (IR) systems are designed for performing efficient keyword search over large collections of information, usually organized as full-text documents. Various highly performant and effective state-of-the-art search engines are readily available. One such engine is Elasticsearch, a distributed full-text search engine that provides scalable search over any kind of textual information. In this thesis we introduce an approach for keyword search over RDF datasets that adapts traditional IR techniques for both indexing and retrieval. Specifically, we test how a dominant IR engine such as Elasticsearch can be adapted to index RDF data and enable keyword search. We provide a systematic analysis of different approaches to cope with the challenges of indexing and retrieving structured information and exploiting the graph capabilities of RDF. The response of the system comprises ranked RDF triples. We also provide policies for ranking the different entities that are contained in these triples, in order to support the requirements of entity search.
We report evaluation results of the different approaches in terms of: (i) the efficiency of indexing and retrieval and (ii) the quality of retrieval. We test the effectiveness of our system by evaluating the relevance of the constructed entities against the DBpedia-Entity test collection, designed for entity search over the DBpedia KB, and compare our results to various state-of-the-art systems. Our results showcase the effectiveness of the proposed user-friendly approach, which exploits the powerful features of scalable state-of-the-art search engines and can be applied to any RDF dataset with no prior knowledge of the domain. The results show that Elasticsearch can effectively support keyword search over RDF data, offering effectiveness comparable to that of systems built from scratch specifically for the task, which use entity-oriented and dataset-specific index structures.
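The abstract does not spell out the indexing or ranking schemes used in the thesis. The Python sketch below is therefore only a rough illustration of the general idea it describes: each RDF triple is treated as one small text document, triples are ranked against a keyword query, and triple scores are aggregated onto the entities they mention. The toy triples, the function names, the term-count score (a stand-in for a real IR score such as the one a search engine would compute), and the sum-based entity ranking are all assumptions made for this example, not the thesis's actual method.

```python
from collections import defaultdict

# Toy RDF triples (hypothetical data, for illustration only).
TRIPLES = [
    ("dbr:Crete", "rdfs:label", "Crete"),
    ("dbr:Crete", "dbo:country", "dbr:Greece"),
    ("dbr:Heraklion", "dbo:isPartOf", "dbr:Crete"),
    ("dbr:Heraklion", "rdfs:label", "Heraklion"),
    ("dbr:Athens", "dbo:country", "dbr:Greece"),
]


def tokenize(term):
    """Turn a prefixed URI or a literal into lowercase keyword tokens."""
    local_name = term.split(":")[-1]  # drop a prefix such as "dbr:" if present
    return local_name.replace("_", " ").lower().split()


def triple_to_document(triple):
    """Flatten one triple into a bag of keywords (one 'document' per triple)."""
    subject, predicate, obj = triple
    return tokenize(subject) + tokenize(predicate) + tokenize(obj)


def search_triples(query, triples):
    """Rank triples by how often the query terms occur in them.

    The raw term count is an assumed, simplified score; a real engine
    would use a proper relevance model instead.
    """
    terms = query.lower().split()
    hits = []
    for triple in triples:
        document = triple_to_document(triple)
        score = sum(document.count(term) for term in terms)
        if score > 0:
            hits.append((score, triple))
    hits.sort(key=lambda hit: hit[0], reverse=True)  # stable: ties keep input order
    return hits


def rank_entities(hits):
    """Aggregate triple scores onto entities in subject or object position.

    Summing scores is one possible policy, chosen here for simplicity.
    """
    scores = defaultdict(float)
    for score, (subject, _predicate, obj) in hits:
        scores[subject] += score
        if obj.startswith("dbr:"):  # literals are not entities
            scores[obj] += score
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    ranked_triples = search_triples("crete", TRIPLES)
    for entity, score in rank_entities(ranked_triples):
        print(entity, score)
```

In this toy setup the query "crete" matches three triples, and the entity dbr:Crete ranks first because it appears in all of them. Other aggregation policies (for example, taking the maximum triple score per entity) would fit the same structure.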
Language English
Issue date 2020-03-27
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
Views 451

Digital Documents
No permission to view document.
It won't be available until: 2021-03-27