License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2023.2
URN: urn:nbn:de:0030-drops-185165
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18516/
Silva, Gabriel ;
Rodrigues, Mário ;
Teixeira, António ;
Amorim, Marlene
A Framework for Fostering Easier Access to Enriched Textual Information
Abstract
Given the amount of information held in unstructured data, suitable methods are needed to extract it. Most of these methods produce their own output format, which makes merging and sharing the extracted information difficult and costly, as there is currently no unified way of representing it. While most of these methods rely on JSON or XML, there has been a push to serialize their output into RDF-compliant formats due to their flexibility and the existing ecosystem surrounding them.
In this paper we introduce a framework whose goal is to serialize enriched data into an RDF format, following FAIR principles, making it more interpretable, interoperable and shareable. We process a subset of the WikiNER dataset and showcase two examples of using this framework: one using CoNLL annotations and the other performing entity linking on an already existing graph. The result is a graph in which every connection starts from the document and ends on tokens, keeping the original text intact while embedding the enriched data into it, in this case the CoNLL annotations and entities.
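The document-to-token graph described in the abstract can be illustrated with a minimal sketch. It is not the framework's implementation: the `http://example.org/` namespace, the predicate names (`hasText`, `hasSentence`, `hasToken`, `word`, `nerTag`), the intermediate sentence level, and the functions `build_triples` and `to_ntriples` are all assumptions made for illustration. The sketch uses only the Python standard library and writes the triples in N-Triples syntax.

```python
# Hypothetical namespace and predicate names; not taken from the paper's code.
BASE = "http://example.org/"


def build_triples(doc_id, text, sentences):
    """Return (subject, predicate, object, is_literal) tuples for one document.

    `sentences` is a list of sentences, each a list of (token, ner_tag)
    pairs in the style of CoNLL annotations.
    """
    doc = f"{BASE}doc/{doc_id}"
    # Keep the original text intact as a literal on the document node.
    triples = [(doc, f"{BASE}hasText", text, True)]
    for s_idx, sentence in enumerate(sentences):
        sent = f"{doc}/sent/{s_idx}"
        # Assumed intermediate level: document -> sentence -> token.
        triples.append((doc, f"{BASE}hasSentence", sent, False))
        for t_idx, (token, tag) in enumerate(sentence):
            tok = f"{sent}/tok/{t_idx}"
            triples.append((sent, f"{BASE}hasToken", tok, False))
            # Enriched data is attached to the token node.
            triples.append((tok, f"{BASE}word", token, True))
            triples.append((tok, f"{BASE}nerTag", tag, True))
    return triples


def escape_literal(value):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def to_ntriples(triples):
    """Serialize the tuples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj, is_literal in triples:
        rendered = f'"{escape_literal(obj)}"' if is_literal else f"<{obj}>"
        lines.append(f"<{subject}> <{predicate}> {rendered} .")
    return "\n".join(lines)


# Made-up sample sentence with IOB-style entity tags.
example = build_triples(
    "0",
    "Lisbon is in Portugal",
    [[("Lisbon", "I-LOC"), ("is", "O"), ("in", "O"), ("Portugal", "I-LOC")]],
)
print(to_ntriples(example))
```

Every path in the resulting graph starts at the document node and ends at a token node, while the untouched source text remains available as a literal on the document itself. A real deployment would more likely use an RDF library and an established vocabulary than hand-built strings.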
BibTeX - Entry
@InProceedings{silva_et_al:OASIcs.SLATE.2023.2,
author = {Silva, Gabriel and Rodrigues, M\'{a}rio and Teixeira, Ant\'{o}nio and Amorim, Marlene},
title = {{A Framework for Fostering Easier Access to Enriched Textual Information}},
booktitle = {12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
pages = {2:1--2:14},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-291-4},
ISSN = {2190-6807},
year = {2023},
volume = {113},
editor = {Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/18516},
URN = {urn:nbn:de:0030-drops-185165},
doi = {10.4230/OASIcs.SLATE.2023.2},
annote = {Keywords: Knowledge graphs, Enriched data, Natural language processing, Triplestore}
}
Keywords: Knowledge graphs, Enriched data, Natural language processing, Triplestore
Collection: 12th Symposium on Languages, Applications and Technologies (SLATE 2023)
Issue Date: 2023
Date of publication: 15.08.2023
Supplementary Material: Software (Dev Repository): https://github.com/gabrielrsilva11/GraphBuilderAPI