License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.LDK.2021.36
URN: urn:nbn:de:0030-drops-145728
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2021/14572/
Nunnari, Fabrizio ;
España-Bonet, Cristina ;
Avramidis, Eleftherios
A Data Augmentation Approach for Sign-Language-To-Text Translation In-The-Wild
Abstract
In this paper, we describe the current main approaches to sign language translation that use deep neural networks with videos as input and text as output. We highlight that, in our view, their main weakness is a lack of generalization to daily-life contexts. Our goal is to build a state-of-the-art system for the automatic interpretation of sign language in unpredictable video framing conditions. Our main contribution is the shift from image features to landmark positions, in order to reduce the size of the input data and facilitate the combination of data augmentation techniques for landmarks. We describe the set of hypotheses for building such a system and the list of experiments that will lead to their verification.
BibTeX - Entry
@InProceedings{nunnari_et_al:OASIcs.LDK.2021.36,
author = {Nunnari, Fabrizio and Espa\~{n}a-Bonet, Cristina and Avramidis, Eleftherios},
title = {{A Data Augmentation Approach for Sign-Language-To-Text Translation In-The-Wild}},
booktitle = {3rd Conference on Language, Data and Knowledge (LDK 2021)},
pages = {36:1--36:8},
series = {Open Access Series in Informatics (OASIcs)},
ISBN = {978-3-95977-199-3},
ISSN = {2190-6807},
year = {2021},
volume = {93},
editor = {Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2021/14572},
URN = {urn:nbn:de:0030-drops-145728},
doi = {10.4230/OASIcs.LDK.2021.36},
annote = {Keywords: sign language, video recognition, end-to-end translation, data augmentation}
}
Keywords: sign language, video recognition, end-to-end translation, data augmentation
Collection: 3rd Conference on Language, Data and Knowledge (LDK 2021)
Issue Date: 2021
Date of publication: 30.08.2021