Post-graduate theses

Identifier uch.csd.msc//2000christofis
Title Σύστημα Εξαγωγής γνώσης από κατανεμημένες και ετερογενείς βάσεις δεδομένων: Εφαρμογή σε ιατρικά πληροφοριακά συστήματα
Alternative Title Knowledge discovery from distributed and heterogeneous databases: A clinical information systems application
Creator Christofis, Constantinos A
Abstract With the current explosion of data, the problem of how to combine distributed and heterogeneous information sources is becoming more and more critical. Besides collecting enormous amounts of data, it is very important to consider the general need for semantic integration of, and knowledge discovery from, these sources: an important and necessary challenge for machine learning (ML) and data mining/knowledge discovery (DM/KDD) researchers. While the distributed nature of data has a more or less clear definition (even if distributed access is hard, and most of the time tedious, to achieve), heterogeneity is a more complex concept. The real issue here is not only how to access the specific information systems that maintain the data, but also how to identify and index the essential information in them. A promising approach to this integration problem is to gain control of the organization's information resources at a meta-data level, while allowing autonomy of individual systems at the data instance level. This thesis addresses the problem of discovering and acquiring knowledge from distributed and heterogeneous data sources. The main challenge is how data mining and machine-learning operations can be adapted and made operational in such a distributed and heterogeneous environment. To this end, a multi-phase data integration procedure is proposed and implemented, which: (1) efficiently accesses structured and distributed data sources; (2) reliably homogenizes and integrates the stored heterogeneous data (with a dedicated domain ontology and respective ontological operations playing a crucial role); (3) effectively performs data processing operations such as traditional statistical analysis, data mining, etc.; and (4) presents the results (e.g., through visualization operations).
The integration approach is realized by coupling multi-disciplinary technologies, ranging from CORBA-based seamless access to distributed data, to semantic data homogenization operations based on the appropriate utilization of domain-specific data models and an ontology, to advanced DTD/XML operations. These operations, coupled with advanced and effective data representation models, form a framework in which efficient and effective ML/KDD operations are performed. The fundamental contribution of our work is the incorporation and customization of KDD/ARM (Association Rules Mining) operations on top of appropriately generated DTD/XML documents. In particular, we tackle the problem of inducing interesting associations between data items stored in remote clinical information systems. The test-bed environment for our approach and implementation is HYGEIAnet, the Integrated Health Care Network of Crete. The presented work expands the HYGEIAnet reference architecture by adding: (a) the information and data semantic indexing operations; (b) the generation of DTD/XML documents to represent and store data, coupled with ontological operations to semantically homogenize the data; (c) the object-oriented data structuring schemas and operations; and (d) the adaptation of KDD operations, realized by a specially devised Association Rules Mining algorithm named AproriXML. Based on the argument that "future databases will use XML-like structures in order to store and retrieve data", the thesis presents a promising architecture and framework for hosting advanced and intelligent data processing operations in the emerging distributed and heterogeneous data and information environment.
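As a rough illustration of what association rule mining over XML-encoded records can look like, the following is a minimal sketch of a textbook Apriori-style procedure. It is not the thesis's AproriXML algorithm, whose details are not given in this abstract. The XML layout (`records`, `patient`, `item`), the sample data, and the function names `load_transactions`, `frequent_itemsets`, and `association_rules` are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# Hypothetical XML layout and data; element names are illustrative only.
XML_DOC = """
<records>
  <patient id="p1"><item>hypertension</item><item>diabetes</item><item>obesity</item></patient>
  <patient id="p2"><item>hypertension</item><item>diabetes</item></patient>
  <patient id="p3"><item>hypertension</item><item>obesity</item></patient>
  <patient id="p4"><item>diabetes</item><item>obesity</item></patient>
  <patient id="p5"><item>hypertension</item><item>diabetes</item><item>obesity</item></patient>
</records>
"""


def load_transactions(xml_text):
    """Turn each <patient> element into a set of item labels (one transaction)."""
    root = ET.fromstring(xml_text)
    return [
        frozenset(item.text.strip() for item in patient.findall("item"))
        for patient in root.findall("patient")
    ]


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: support}."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    current = {frozenset([i]) for t in transactions for i in t}
    result = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        result.update(level)
        k += 1
        # Join step: merge frequent sets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return result


def association_rules(freq, min_confidence):
    """Derive rules (antecedent, consequent, support, confidence)."""
    rules = []
    for itemset, sup in freq.items():
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                confidence = sup / freq[ante]
                if confidence >= min_confidence:
                    rules.append((ante, itemset - ante, sup, confidence))
    return rules


if __name__ == "__main__":
    transactions = load_transactions(XML_DOC)
    freq = frequent_itemsets(transactions, min_support=0.5)
    for ante, cons, sup, conf in association_rules(freq, min_confidence=0.7):
        print(sorted(ante), "->", sorted(cons), f"support={sup:.2f}", f"confidence={conf:.2f}")
```

The thresholds `min_support` and `min_confidence` are the usual tuning parameters of this family of algorithms; the values used here are arbitrary and chosen only to suit the toy data.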
Issue date 2000-07-01
Date available 2000-07-26
Collection   Faculty/Department--Faculty of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
Views 236
