License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2023.2
URN: urn:nbn:de:0030-drops-185165
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2023/18516/
Go to the corresponding OASIcs Volume Portal


Silva, Gabriel ; Rodrigues, Mário ; Teixeira, António ; Amorim, Marlene

A Framework for Fostering Easier Access to Enriched Textual Information

pdf-format:
OASIcs-SLATE-2023-2.pdf (1.0 MB)


Abstract

Considering the amount of information in unstructured data it is necessary to have suitable methods to extract information from it. Most of these methods have their own output making it difficult and costly to merge and share this information as there currently is no unified way of representing this information. While most of these methods rely on JSON or XML there has been a push to serialize these into RDF compliant formats due to their flexiblity and the existing ecosystem surrounding them.
In this paper we introduce a framework whose goal is to provide a serialization of enriched data into an RDF format, following FAIR principles, making it more interpretable, interoperable and shareable. We process a subset of the WikiNER dataset and showcase two examples of using this framework: One using CoNLL annotations and the other by performing entity-linking on an already existing graph. The results are a graph with every connection starting from the document and finishing on tokens while keeping the original text intact while embedding the enriched data into it, in this case the CoNLL annotations and Entities.

BibTeX - Entry

@InProceedings{silva_et_al:OASIcs.SLATE.2023.2,
  author =	{Silva, Gabriel and Rodrigues, M\'{a}rio and Teixeira, Ant\'{o}nio and Amorim, Marlene},
  title =	{{A Framework for Fostering Easier Access to Enriched Textual Information}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{2:1--2:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2023/18516},
  URN =		{urn:nbn:de:0030-drops-185165},
  doi =		{10.4230/OASIcs.SLATE.2023.2},
  annote =	{Keywords: Knowledge graphs, Enriched data, Natural language processing, Triplestore}
}

Keywords: Knowledge graphs, Enriched data, Natural language processing, Triplestore
Collection: 12th Symposium on Languages, Applications and Technologies (SLATE 2023)
Issue Date: 2023
Date of publication: 15.08.2023
Supplementary Material: Software (Dev Repository): https://github.com/gabrielrsilva11/GraphBuilderAPI


DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI