License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.SLATE.2013.259
URN: urn:nbn:de:0030-drops-40420
URL: http://dagstuhl.sunsite.rwth-aachen.de/volltexte/2013/4042/


Medeiros, Henrique ; Batista, Fernando ; Moniz, Helena ; Trancoso, Isabel ; Nunes, Luis

Comparing Different Methods for Disfluency Structure Detection


Abstract

This paper presents a number of experiments focusing on assessing the performance of different machine learning methods on the identification of disfluencies and their distinct structural regions over speech data. Several machine learning methods have been applied, namely Naive Bayes, Logistic Regression, Classification and Regression Trees (CARTs), J48 and Multilayer Perceptron. Our experiments show that CARTs outperform the other methods on the identification of the distinct structural disfluent regions. Reported experiments are based on audio segmentation and prosodic features, calculated from a corpus of university lectures in European Portuguese, containing about 32h of speech and about 7.7% of disfluencies. The set of features automatically extracted from the forced alignment corpus proved to be discriminant of the regions contained in the production of a disfluency. This work shows that using fully automatic prosodic features, disfluency structural regions can be reliably identified using CARTs, where the best results achieved correspond to 81.5% precision, 27.6% recall, and 41.2% F-measure. The best results concern the detection of the interregnum, followed by the detection of the interruption point.
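
The reported scores can be cross-checked with a short sketch. It assumes the F-measure in the abstract is the balanced F1 score, the harmonic mean of precision and recall; the abstract does not say so explicitly. The function name `f_measure` is a hypothetical helper introduced for illustration, not something from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's F-measure weights precision and recall
    equally. Inputs and output share the same scale (here, percent).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Best CART results quoted in the abstract, in percent.
best_precision = 81.5
best_recall = 27.6

print(round(f_measure(best_precision, best_recall), 1))  # prints 41.2
```

Under that assumption, the computed value agrees with the 41.2% stated in the abstract. The low recall dominates the harmonic mean despite the high precision.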

BibTeX - Entry

@InProceedings{medeiros_et_al:OASIcs:2013:4042,
  author =	{Henrique Medeiros and Fernando Batista and Helena Moniz and Isabel Trancoso and Luis Nunes},
  title =	{{Comparing Different Methods for Disfluency Structure Detection}},
  booktitle =	{2nd Symposium on Languages, Applications and Technologies},
  pages =	{259--269},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-52-1},
  ISSN =	{2190-6807},
  year =	{2013},
  volume =	{29},
  editor =	{Jos{\'e} Paulo Leal and Ricardo Rocha and Alberto Sim{\~o}es},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2013/4042},
  URN =		{urn:nbn:de:0030-drops-40420},
  doi =		{10.4230/OASIcs.SLATE.2013.259},
  annote =	{Keywords: Machine learning, speech processing, prosodic features, automatic detection of disfluencies}
}

Keywords: Machine learning, speech processing, prosodic features, automatic detection of disfluencies
Collection: 2nd Symposium on Languages, Applications and Technologies
Issue Date: 2013
Date of publication: 05.06.2013

