License: Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported license (CC BY-NC-ND 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2012.267
URN: urn:nbn:de:0030-drops-35285
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2012/3528/
Laki, László
Investigating the Possibilities of Using SMT for Text Annotation
Abstract
In this paper I examine the applicability of statistical machine translation (SMT) methodology to part-of-speech disambiguation and lemmatization in Hungarian. After a baseline system was created, several methods were tried to improve its efficiency. I also applied methods to decrease the size of the target dictionary and to handle out-of-vocabulary words properly. The results show that such a lightweight system achieves results comparable to those of other state-of-the-art systems.
BibTeX - Entry
@InProceedings{laki:OASIcs:2012:3528,
author = {L{\'a}szl{\'o} Laki},
title = {{Investigating the Possibilities of Using SMT for Text Annotation}},
booktitle = {1st Symposium on Languages, Applications and Technologies},
pages = {267--283},
series = {OpenAccess Series in Informatics (OASIcs)},
ISBN = {978-3-939897-40-8},
ISSN = {2190-6807},
year = {2012},
volume = {21},
editor = {Alberto Sim{\~o}es and Ricardo Queir{\'o}s and Daniela da Cruz},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2012/3528},
URN = {urn:nbn:de:0030-drops-35285},
doi = {10.4230/OASIcs.SLATE.2012.267},
annote = {Keywords: SMT, POS-tagging, Lemmatization, Target language set, OOV}
}
Keywords: SMT, POS-tagging, Lemmatization, Target language set, OOV
Collection: 1st Symposium on Languages, Applications and Technologies
Issue Date: 2012
Date of publication: 21.06.2012