Post-graduate theses
Identifier |
uch.csd.msc//1997sintichakis |
Title |
Συγχώνευση Μονόγλωσσων Θησαυρών Θεματικών Όρων |
Alternative Title |
Monolingual thesauri merging |
Creator |
Sintichakis, Marios
|
Contributor |
Π. Κωνσταντόπουλος
|
Abstract |
A thesaurus is a conceptual structure that represents concepts from a particular domain of discourse using a controlled vocabulary and three types of relationships between concepts: equivalence, hierarchical, and associative. Practice has shown that thesauri can play an essential role as components of information retrieval systems. Yet constructing a thesaurus is an exceptionally hard and time-consuming task. In this thesis, we address the problem of merging monolingual thesauri as a means of thesaurus construction and development. The aim of thesaurus merging is to integrate both the vocabularies and the relationships of two or more thesauri, producing a new thesaurus that describes more concepts than any one of the thesauri being merged does. We address the problem systematically: we introduce a set-theoretic framework for the representation of thesauri and the relevant integrity constraints, independent of any implementation issues. We decompose the merging process into four phases: pre-integration, analysis, conflict detection and resolution, and integration. Our attention is mainly focused on the analysis phase, which aims to detect terms in the thesauri being merged that denote the same concept. To detect such terms, we introduce a model for computing the conceptual distance between terms. This model is an adaptation of a more general model of analogical similarity and is based on the relationships between terms. Moreover, we combine conceptual term distances with lexical similarity and equivalence relationships to reduce the size of the term sets that must be considered. Merging is a top-down procedure based on the topological ordering imposed by the directed acyclic graph formed by the hierarchical relationships of the thesauri being merged. This policy has the advantage that it tends to gradually improve the accuracy of the conceptual distances computed at each level. The mode of operation can be either batch or interactive. We have used the Telos data model of the Semantic Index System (SIS) for the representation, storage, and management of thesauri. Using SIS as a platform, we have implemented a prototype of our method in C++. We have conducted a merging experiment attempting to integrate two well-known thesauri: the "Computing Reviews Classification System" and the "Library of Congress Subject Headings". The first results were quite encouraging, and we believe that slight modifications can further improve our method.
|
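The top-down, level-by-level traversal and the lexical pre-filtering mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis prototype (which the abstract says was written in C++ on SIS), and the abstract does not specify the conceptual distance model, so that part is omitted. The function names `topological_levels` and `lexical_candidates`, the dictionary representation of hierarchical relationships, the use of `difflib` as the lexical similarity measure, and the 0.8 threshold are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def topological_levels(narrower):
    """Group terms into levels of a hierarchy DAG (Kahn's algorithm).

    `narrower` maps a term to its narrower terms (hypothetical layout).
    Level 0 holds the top terms; a term appears only after all of its
    broader terms have appeared in earlier levels.
    """
    indegree = defaultdict(int)
    terms = set(narrower)
    for children in narrower.values():
        for child in children:
            indegree[child] += 1
            terms.add(child)

    level = sorted(t for t in terms if indegree[t] == 0)
    levels = []
    while level:
        levels.append(level)
        next_level = []
        for term in level:
            for child in narrower.get(term, ()):
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_level.append(child)
        level = sorted(next_level)

    if sum(len(lv) for lv in levels) != len(terms):
        raise ValueError("hierarchical relationships contain a cycle")
    return levels


def lexical_candidates(term, others, threshold=0.8):
    """Keep only terms whose spelling is close to `term` (assumed measure)."""
    return [
        other for other in others
        if SequenceMatcher(None, term.lower(), other.lower()).ratio() >= threshold
    ]


# Hypothetical toy hierarchy, not taken from either thesaurus in the experiment.
hierarchy = {
    "Software": ["Operating systems", "Programming languages"],
    "Operating systems": ["File systems"],
}
levels = topological_levels(hierarchy)
candidates = lexical_candidates(
    "Operating systems", ["Operating Systems", "File systems"]
)
```

Processing terms level by level means that, by the time a term is compared, its broader terms have already been handled, which is one way to read the abstract's remark that accuracy tends to improve at each level.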
Subject |
a) Computer Networks and Digital Communications, b) Information Systems and Software Engineering |
Issue date |
1997-03-01 |
Date available |
1997-06-2 |
Collection |
School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses |
Type of Work--Post-graduate theses |
Permanent Link |
https://elocus.lib.uoc.gr//dlib/8/f/c/metadata-dlib-1997sintichakis.tkl
|