Post-graduate theses

Identifier 000434174
Title HInT: hybrid and incremental type discovery for large RDF data sources
Alternative Title HInT: υβριδική και αυξητική ανακάλυψη τύπων για μεγάλα RDF δεδομένα
Author Καρδουλάκης, Νικόλαος Χ.
Thesis advisor Πλεξουσάκης, Δημήτρης
Reviewer Τζίτζικας, Ιωάννης
Kedad-Cointot, Zoubida
Abstract The rapid growth of linked data has resulted in many weakly structured and incomplete data sources, in which type declarations are completely or partially missing. Type information, however, is essential for a number of tasks such as query answering, integration, summarization and partitioning. Existing approaches to type discovery either completely ignore the type declarations available in the dataset (implicit type discovery approaches) or rely on the partial availability of those types in order to complement them (explicit type enrichment approaches). Implicit type discovery approaches are based on instance grouping, which requires an exhaustive comparison between the instances; this process is expensive and not incremental. Explicit type enrichment approaches, on the other hand, cannot process data sources that have little or no schema information. In this thesis, we present HInT, the first incremental and hybrid type discovery system for RDF datasets. It enables type discovery in datasets where type declarations are either partially available or completely missing. To achieve this goal, we incrementally identify the patterns of the various instances, index them, and then group them to identify the types. During the processing of an instance, our approach exploits its type information, if available, to improve the quality of the discovered types by guiding the classification of the new instance into the correct group and by refining the groups already built. We show analytically and experimentally that our approach outperforms competitors from both worlds, implicit type discovery and explicit type enrichment, in terms of effectiveness and, most importantly, efficiency.
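The incremental, hybrid grouping idea described in the abstract can be illustrated with a minimal sketch. This is not the HInT implementation. It assumes that an instance's pattern is simply its set of property names, it uses a linear scan with Jaccard similarity in place of the locality sensitive hashing index named in the subject terms, and it omits the refinement of groups already built. The names `IncrementalTypeDiscovery`, `Group`, `add` and the `threshold` parameter are hypothetical.

```python
from dataclasses import dataclass, field


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


@dataclass
class Group:
    """A candidate type: a set of instances with similar patterns."""
    properties: set = field(default_factory=set)      # union of member patterns
    declared_types: set = field(default_factory=set)  # explicit types seen so far
    members: list = field(default_factory=list)


class IncrementalTypeDiscovery:
    """Hypothetical sketch: assigns each instance to a group as it arrives."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed similarity cut-off
        self.groups = []

    def add(self, instance_id, properties, declared_type=None):
        pattern = set(properties)
        group = None

        # Hybrid step: a declared type, if present, guides the assignment.
        if declared_type is not None:
            group = next(
                (g for g in self.groups if declared_type in g.declared_types),
                None,
            )

        # Implicit step: otherwise fall back to pattern similarity.
        if group is None:
            # A typed instance is never merged into a group that already
            # carries a different declared type.
            candidates = [
                g for g in self.groups
                if declared_type is None or not g.declared_types
            ]
            best = max(
                candidates,
                key=lambda g: jaccard(pattern, g.properties),
                default=None,
            )
            if best is not None and jaccard(pattern, best.properties) >= self.threshold:
                group = best

        # No suitable group: the instance starts a new one.
        if group is None:
            group = Group()
            self.groups.append(group)

        group.properties |= pattern
        group.members.append(instance_id)
        if declared_type is not None:
            group.declared_types.add(declared_type)
        return group
```

Each call to `add` processes one instance without revisiting earlier ones, which is the sense in which the sketch is incremental. A real system would replace the linear scan with an index so that candidate groups are found without comparing against every group.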
Language English
Subject Incrementality
Locality sensitive hashing
Issue date 2020-11-27
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses