License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2019.6
URN: urn:nbn:de:0030-drops-108735
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2019/10873/


Simões, Alberto ; Gómez Guinovart, Xavier

Acquiring Domain-Specific Knowledge for WordNet from a Terminological Database



Abstract

In this research we explore a terminological database (Termoteca) to expand the Portuguese and Galician wordnets (PULO and Galnet) with new synset variants (word forms for a concept), usage examples for the variants, and synset glosses or definitions.
The methodology is based on aligning WordNet concepts (synsets) with the concepts described in Termoteca (terminological records), taking into account the lexical forms in both resources, their morphological category, and their knowledge domains. The domain information comes from the WordNet Domains Hierarchy and the Termoteca field domains, and is used to reduce the incidence of polysemy and homography in the results.
The results confirm our hypothesis that combining the semantic domain information included in both resources makes it possible to minimise lexical ambiguity and to obtain a very acceptable precision in terminological information extraction tasks: precision exceeds 89% when the Galnet synset and the Termoteca record share at least one lexical form in two or more different languages.
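
The alignment criteria described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Concept` class, the `shared_languages` and `is_candidate` functions, the `min_languages` parameter, and the sample entries are all hypothetical. The sketch also assumes that lexical forms are already normalised and that the domain labels of both resources have been mapped to a common label set.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """A synset or a terminological record (hypothetical, simplified shape)."""
    ident: str
    pos: str                      # morphological category, e.g. "n" for noun
    domains: set[str]             # knowledge-domain labels (assumed comparable)
    forms: dict[str, set[str]] = field(default_factory=dict)  # language -> lexical forms


def shared_languages(synset: Concept, record: Concept) -> set[str]:
    """Languages in which both concepts share at least one lexical form."""
    return {
        lang
        for lang, forms in synset.forms.items()
        if forms & record.forms.get(lang, set())
    }


def is_candidate(synset: Concept, record: Concept, min_languages: int = 2) -> bool:
    """Apply the three filters named in the abstract: category, domain, shared forms."""
    if synset.pos != record.pos:
        return False
    if not (synset.domains & record.domains):
        return False
    return len(shared_languages(synset, record)) >= min_languages


# Toy entries invented for illustration only; they are not taken from either resource.
synset = Concept("syn-1", "n", {"biology"},
                 {"gl": {"célula"}, "pt": {"célula"}, "en": {"cell"}})
record = Concept("rec-1", "n", {"biology"},
                 {"gl": {"célula"}, "en": {"cell"}})
other = Concept("rec-2", "n", {"telecommunications"},
                {"gl": {"célula"}, "en": {"cell"}})

print(is_candidate(synset, record))  # same category, same domain, forms shared in gl and en
print(is_candidate(synset, other))   # same forms, but the domains do not overlap
```

The default of `min_languages=2` mirrors the condition under which the abstract reports precision above 89%. The second toy record shows how the domain filter is meant to discard a homographic term from an unrelated field.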

BibTeX - Entry

@InProceedings{simes_et_al:OASIcs:2019:10873,
  author =	{Alberto Sim{\~o}es and Xavier G{\'o}mez Guinovart},
  title =	{{Acquiring Domain-Specific Knowledge for WordNet from a Terminological Database}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{6:1--6:13},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Ricardo Rodrigues and Jan Janousek and Lu{\'\i}s Ferreira and Lu{\'\i}sa Coheur and Fernando Batista and Hugo Gon{\c{c}}alo Oliveira},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2019/10873},
  URN =		{urn:nbn:de:0030-drops-108735},
  doi =		{10.4230/OASIcs.SLATE.2019.6},
  annote =	{Keywords: WordNet, Terminology, Lexical Resources, Natural Language Processing}
}

Keywords: WordNet, Terminology, Lexical Resources, Natural Language Processing
Collection: 8th Symposium on Languages, Applications and Technologies (SLATE 2019)
Issue Date: 2019
Date of publication: 24.07.2019

