License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2014.201
URN: urn:nbn:de:0030-drops-45702
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2014/4570/


Leal, José Paulo ; Costa, Teresa

Multiscale Parameter Tuning of a Semantic Relatedness Algorithm



Abstract

The research presented in this paper builds on previous work that led to the definition of a family of semantic relatedness algorithms that compute a proximity given a pair of concept labels as input. The algorithms depend on a semantic graph, provided as RDF data, and on a particular set of weights assigned to the properties of RDF statements (types of arcs in the RDF graph). The current research objective is to automatically tune the weights for a given graph in order to increase the proximity quality. The quality of a semantic relatedness method is usually measured against a benchmark data set. The results produced by the method are compared with those on the benchmark using Spearman's rank coefficient. This methodology works the other way round and uses this coefficient to tune the proximity weights. The tuning process is controlled by a genetic algorithm using Spearman's rank coefficient as the fitness function. The genetic algorithm has its own set of parameters, which also need to be tuned. Bootstrapping, a statistical method for generating samples, is used in this methodology to enable a large number of repetitions of the genetic algorithm, exploring the results of alternative parameter settings. This approach raises several technical challenges due to its computational complexity. This paper provides details on the techniques used to speed up this process. The proposed approach was validated with WordNet 2.0 and the WordSim-353 data set. Several ranges of parameter values were tested and the obtained results are better than the state-of-the-art methods for computing semantic relatedness using WordNet 2.0, with the advantage of not requiring any domain knowledge of the ontological graph.
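
The two building blocks named in the abstract, a fitness function based on Spearman's rank coefficient and bootstrap resampling of the benchmark, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `proximity` callable, the `(label_a, label_b, human_score)` layout of `benchmark`, and all function names are assumptions made for illustration only.

```python
import random


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman's rank coefficient: Pearson correlation of the two rank lists."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant input; treated as 0 here
    return cov / (var_x * var_y) ** 0.5


def fitness(weights, benchmark, proximity):
    """Score a weight vector by how well its proximities rank like the benchmark.

    `proximity(a, b, weights)` is a hypothetical stand-in for the
    graph-based relatedness algorithm described in the abstract.
    """
    predicted = [proximity(a, b, weights) for a, b, _ in benchmark]
    human = [score for _, _, score in benchmark]
    return spearman(predicted, human)


def bootstrap_sample(benchmark, rng):
    """Draw a sample of the same size as the benchmark, with replacement."""
    return [rng.choice(benchmark) for _ in benchmark]
```

In this sketch, a genetic algorithm would call `fitness` on each candidate weight vector, and `bootstrap_sample(benchmark, random.Random(seed))` would provide a different resampled benchmark for each repetition when comparing alternative parameter settings.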

BibTeX - Entry

@InProceedings{leal_et_al:OASIcs:2014:4570,
  author =	{Jos{\'e} Paulo Leal and Teresa Costa},
  title =	{{Multiscale Parameter Tuning of a Semantic Relatedness Algorithm}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{201--213},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Maria Jo{\~a}o Varanda Pereira and Jos{\'e} Paulo Leal and Alberto Sim{\~o}es},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2014/4570},
  URN =		{urn:nbn:de:0030-drops-45702},
  doi =		{10.4230/OASIcs.SLATE.2014.201},
  annote =	{Keywords: semantic similarity, linked data, genetic algorithms, bootstrapping, WordNet}
}

Keywords: semantic similarity, linked data, genetic algorithms, bootstrapping, WordNet
Collection: 3rd Symposium on Languages, Applications and Technologies
Issue Date: 2014
Date of publication: 18.06.2014

