License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.LDK.2021.33
URN: urn:nbn:de:0030-drops-145691
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14569/
Ell, Basil; Elahi, Mohammad Fazleh; Cimiano, Philipp
Bridging the Gap Between Ontology and Lexicon via Class-Specific Association Rules Mined from a Loosely-Parallel Text-Data Corpus
Abstract
There is a well-known lexical gap between content expressed in the form of natural language (NL) texts and content stored in an RDF knowledge base (KB). For tasks such as Information Extraction (IE), this gap needs to be bridged from NL to KB, so that facts extracted from text can be represented in RDF and then added to an RDF KB. For tasks such as Natural Language Generation, this gap needs to be bridged from KB to NL, so that facts stored in an RDF KB can be verbalized and read by humans. In this paper we propose LexExMachina, a new methodology that induces correspondences between lexical elements and KB elements by mining class-specific association rules. As an example of such an association rule, consider the rule that predicts that if the text about a person contains the token "Greek", then this person has the relation nationality to the entity Greece. Another rule predicts that if the text about a settlement contains the token "Greek", then this settlement has the relation country to the entity Greece. Such a rule can help in question answering, as it maps an adjective to the relevant KB terms, and it can help in information extraction from text. We propose and empirically investigate a set of 20 types of class-specific association rules together with different interestingness measures to rank them. We apply our method to a loosely-parallel text-data corpus that consists of data from DBpedia and texts from Wikipedia, and provide empirical evidence for the utility of the rules for Question Answering.
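The "Greek" example in the abstract can be made concrete with a small sketch. The code below is not the LexExMachina implementation: the toy corpus is invented for illustration, the function name `mine_rules` is hypothetical, and it covers only one rule shape (class plus token implies predicate plus object) ranked by plain support and confidence, which may differ from the 20 rule types and the interestingness measures studied in the paper.

```python
from collections import Counter

# Hypothetical toy corpus (made up for illustration, not DBpedia/Wikipedia data):
# each entry is (class, tokens in the entity's text, KB facts about the entity).
corpus = [
    ("Person", {"greek", "philosopher"}, {("nationality", "Greece")}),
    ("Person", {"greek", "poet"}, {("nationality", "Greece")}),
    ("Person", {"german", "poet"}, {("nationality", "Germany")}),
    ("Settlement", {"greek", "island"}, {("country", "Greece")}),
    ("Settlement", {"greek", "village"}, {("country", "Greece")}),
]


def mine_rules(corpus, min_confidence=0.5):
    """Mine rules of the form (class, token) => (predicate, object).

    support    = number of entities of the class with both the token and the fact
    confidence = support / number of entities of the class with the token
    """
    token_count = Counter()  # (class, token) -> entity count
    joint_count = Counter()  # (class, token, fact) -> entity count
    for cls, tokens, facts in corpus:
        for token in tokens:
            token_count[(cls, token)] += 1
            for fact in facts:
                joint_count[(cls, token, fact)] += 1

    rules = []
    for (cls, token, fact), joint in joint_count.items():
        confidence = joint / token_count[(cls, token)]
        if confidence >= min_confidence:
            rules.append({
                "class": cls,
                "token": token,
                "fact": fact,
                "support": joint,
                "confidence": confidence,
            })
    # Rank by confidence, then support; remaining keys make the order deterministic.
    rules.sort(key=lambda r: (-r["confidence"], -r["support"],
                              r["class"], r["token"], r["fact"]))
    return rules


rules = mine_rules(corpus)
for rule in rules:
    predicate, obj = rule["fact"]
    print(f'{rule["class"]}: "{rule["token"]}" => {predicate} {obj} '
          f'(support={rule["support"]}, confidence={rule["confidence"]:.2f})')
```

Because counts are kept per class, the same token "greek" yields a nationality rule for persons and a country rule for settlements, which is the class-specific behaviour the abstract describes.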
BibTeX - Entry
@InProceedings{ell_et_al:OASIcs.LDK.2021.33,
author = {Ell, Basil and Elahi, Mohammad Fazleh and Cimiano, Philipp},
title = {{Bridging the Gap Between Ontology and Lexicon via Class-Specific Association Rules Mined from a Loosely-Parallel Text-Data Corpus}},
booktitle = {3rd Conference on Language, Data and Knowledge (LDK 2021)},
pages = {33:1--33:21},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-199-3},
ISSN = {2190-6807},
year = {2021},
volume = {93},
editor = {Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/14569},
URN = {urn:nbn:de:0030-drops-145691},
doi = {10.4230/OASIcs.LDK.2021.33},
annote = {Keywords: Ontology, Lexicon, Association Rules, Pattern Mining}
}
Keywords: Ontology, Lexicon, Association Rules, Pattern Mining
Collection: 3rd Conference on Language, Data and Knowledge (LDK 2021)
Issue Date: 2021
Date of publication: 30.08.2021
Supplementary Material: Collection (Dataset and Source Code): http://www.LexExMachina.xyz