
Results - Details

Identifier 000381358
Title Use of unsupervised word classes for entity recognition. Application to the detection of disorders in clinical reports
Author Χατζημηνά, Μαρία Ευαγγελία
Thesis advisor Zweigenbaum, Pierre
Select a value Allauzen, Alexandre
Lavergne, Thomas
Abstract Natural language processing (NLP) is the branch of computer science focused on developing systems that allow computers to communicate with people using everyday language. Many NLP techniques, including stemming, part-of-speech tagging, named entity recognition, compound recognition, de-compounding, chunking, and word sense disambiguation, among others, have been used for information extraction. In many cases, semantic information is used to expand knowledge about documents and to improve performance. Interest in NLP strategies applied to clinical texts is growing because of the increasing number of electronic documents in hospital information systems. Biomedical text mining is a research field at the edge of natural language processing and refers to text mining applied to clinical text or to the literature of the biomedical domain. In this work, we present a methodology that combines unsupervised word classes with supervised machine learning methods in order to contribute to named entity recognition in clinical reports. Named entity recognition is generally performed using knowledge-based semantic resources. We present an approach in which data-driven word classes are evaluated and compared with knowledge-based semantic classes when inserted as features in a Conditional Random Field (CRF) classifier. We examine different methods of combining data-driven word classes with knowledge-based semantic classes to improve named entity recognition. Data-driven semantic classes achieve results that differ only slightly from those of knowledge-based semantic classes. Our case study concludes that data-driven word classes can add important information and are complementary to knowledge-based semantic classes.
Language English
Issue date 2013-11-15
Collection   School/Department--School of Sciences and Engineering--Department of Computer Science--Post-graduate theses
  Type of Work--Post-graduate theses
