License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2021.10
URN: urn:nbn:de:0030-drops-144277
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14427/


Simões, Alberto ; Gamallo, Pablo

LeMe-PT: A Medical Package Leaflet Corpus for Portuguese

pdf-format:
OASIcs-SLATE-2021-10.pdf (0.6 MB)


Abstract

The current trend in natural language processing is the use of machine learning, which is being applied in every field, from summarization to machine translation. Applying these techniques requires resources, namely quality corpora. While there are large quantities of corpora for the Portuguese language, there is a lack of technical and focused corpora. Therefore, in this article we present a new corpus, built from drug package leaflets. We describe its structure and contents, and discuss possible directions for exploring it.

BibTeX - Entry

@InProceedings{simoes_et_al:OASIcs.SLATE.2021.10,
  author =	{Sim\~{o}es, Alberto and Gamallo, Pablo},
  title =	{{LeMe-PT: A Medical Package Leaflet Corpus for Portuguese}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{10:1--10:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/14427},
  URN =		{urn:nbn:de:0030-drops-144277},
  doi =		{10.4230/OASIcs.SLATE.2021.10},
  annote =	{Keywords: drug corpora, information extraction, word embeddings}
}

Keywords: drug corpora, information extraction, word embeddings
Collection: 10th Symposium on Languages, Applications and Technologies (SLATE 2021)
Issue Date: 2021
Date of publication: 10.08.2021
Supplementary Material: Dataset: https://github.com/ambs/LeMe

