E-Locus - Institutional Repository of the University of Crete - Use of unsupervised word classes for entity recognition. Application to the detection of disorders in clinical reports

Identifier

000381358

Title

Use of unsupervised word classes for entity recognition. Application to the detection of disorders in clinical reports

Author

Χατζημηνά, Μαρία Ευαγγελία

Thesis advisor

Zweigenbaum, Pierre

Select a value

Allauzen, Alexandre
Lavergne, Thomas

Abstract

Natural language processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language. Many NLP techniques, including stemming, part-of-speech tagging, named entity recognition, compound recognition, de-compounding, chunking, and word sense disambiguation, have been used for information extraction. In many cases, semantic information is used to expand knowledge about documents and to improve performance. Interest in NLP strategies applied to clinical texts is growing because of the increasing number of electronic documents in hospital information systems. Biomedical text mining is a research field at the boundary of natural language processing and refers to text mining applied to clinical text or to the literature of the biomedical domain. In this work, we present a methodology that combines unsupervised word classes with supervised machine learning methods in order to contribute to named entity recognition on clinical reports. Named entity recognition is generally performed using knowledge-based semantic resources. We present an approach in which data-driven word classes are evaluated and compared with knowledge-based semantic classes when inserted as features in a Conditional Random Field (CRF) classifier. We examine different methods of combining data-driven word classes with knowledge-based semantic classes to improve named entity recognition. Data-driven word classes achieve results that differ only slightly from those of knowledge-based semantic classes. Our case study concluded that data-driven word classes can add important information and are complementary to knowledge-based semantic classes.

Language

English

Issue date

2013-11-15

Collection

School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses

Type of Work--Post-graduate theses
