License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2019.4
URN: urn:nbn:de:0030-drops-108715
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2019/10871/


Shulby, Christopher Dane ; Ferreira, Martha Dais ; de Mello, Rodrigo F. ; Aluisio, Sandra Maria

Robust Phoneme Recognition with Little Data

pdf-format:
OASIcs-SLATE-2019-4.pdf (0.4 MB)


Abstract

A common belief in the community is that deep learning requires large datasets to be effective. We show that, with careful parameter selection, deep feature extraction can be applied even to small datasets. We also explore how much data is necessary to guarantee learning, using convergence analysis and calculating the shattering coefficient for the algorithms used. Another problem is that state-of-the-art results are rarely reproducible because they rely on proprietary datasets, pretrained networks, and/or weight initializations from other, larger networks. We present a two-fold novelty for this situation: a carefully designed CNN architecture, together with a knowledge-driven classifier, achieves nearly state-of-the-art phoneme recognition results with no pretraining or external weight initialization. We also beat the best replication study of the state of the art, with a 28% frame error rate (FER). More importantly, we achieve transparent, reproducible frame-level accuracy and, additionally, perform a convergence analysis to show the generalization capacity of the model, providing statistical evidence that our results are not obtained by chance. Furthermore, we show how algorithms with strong learning guarantees can not only benefit from raw data extraction but also contribute more robust results.

BibTeX - Entry

@InProceedings{shulby_et_al:OASIcs:2019:10871,
  author =	{Christopher Dane Shulby and Martha Dais Ferreira and Rodrigo F. de Mello and Sandra Maria Aluisio},
  title =	{{Robust Phoneme Recognition with Little Data}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{4:1--4:11},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Ricardo Rodrigues and Jan Janousek and Lu{\'\i}s Ferreira and Lu{\'\i}sa Coheur and Fernando Batista and Hugo Gon{\c{c}}alo Oliveira},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2019/10871},
  URN =		{urn:nbn:de:0030-drops-108715},
  doi =		{10.4230/OASIcs.SLATE.2019.4},
  annote =	{Keywords: feature extraction, acoustic modeling, phoneme recognition, statistical learning theory}
}

Keywords: feature extraction, acoustic modeling, phoneme recognition, statistical learning theory
Collection: 8th Symposium on Languages, Applications and Technologies (SLATE 2019)
Issue Date: 2019
Date of publication: 24.07.2019

