License: Creative Commons Attribution 3.0 Unported license (CC BY 3.0)
When quoting this document, please refer to the following
DOI: 10.4230/DFU.Vol3.11041.175
URN: urn:nbn:de:0030-drops-34725

Müller, Meinard ; Driedger, Jonathan

Data-Driven Sound Track Generation



Background music is often used to create a specific atmosphere or to draw our attention to specific events. For example, in movies or computer games it is often the accompanying music that conveys the emotional state of a scene and plays an important role in immersing the viewer or player in the virtual environment. In view of home-made videos, slide shows, and other consumer-generated visual media streams, there is a need for computer-assisted tools that allow users to generate aesthetically appealing music tracks in an easy and intuitive way. In this contribution, we consider a data-driven scenario where the musical raw material is given in the form of a database containing a variety of audio recordings. Then, for a given visual media stream, the task consists of identifying, manipulating, overlaying, concatenating, and blending suitable music clips to generate a music stream that satisfies certain constraints imposed by the visual data stream and by user specifications. Our main goal is to give an overview of the various content-based music processing and retrieval techniques that become important in data-driven sound track generation. In particular, we sketch a general pipeline that highlights how these techniques act together and come into play when generating musically plausible transitions between subsequent music clips.
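The blending step mentioned above — joining two subsequent music clips with a smooth transition — can be illustrated with a minimal sketch. The snippet below is not the authors' pipeline; it is a hypothetical equal-power crossfade over raw sample arrays (function name and parameters are illustrative), assuming mono audio as NumPy float arrays at a common sample rate:

```python
import numpy as np

def crossfade(clip_a, clip_b, sr, fade_dur=2.0):
    """Blend the tail of clip_a into the head of clip_b with an
    equal-power (sin/cos) crossfade. Hypothetical helper: clips are
    mono float arrays sampled at rate sr; fade_dur is in seconds."""
    n = int(fade_dur * sr)
    n = min(n, len(clip_a), len(clip_b))  # clamp to clip lengths
    t = np.linspace(0.0, np.pi / 2, n)
    fade_out = np.cos(t)  # tapers clip_a from 1 down to 0
    fade_in = np.sin(t)   # raises clip_b from 0 up to 1
    overlap = clip_a[-n:] * fade_out + clip_b[:n] * fade_in
    return np.concatenate([clip_a[:-n], overlap, clip_b[n:]])

# Two toy "clips": three-second sine tones at different pitches.
sr = 8000
t = np.arange(sr * 3) / sr
a = np.sin(2 * np.pi * 440 * t)
b = np.sin(2 * np.pi * 330 * t)
mix = crossfade(a, b, sr, fade_dur=1.0)  # 5 s total: 3 + 3 - 1 s overlap
```

In a full pipeline, such a crossfade would be preceded by the techniques the chapter surveys — beat tracking and time-scale modification to align tempo, and harmonic matching to pick clips whose keys are compatible — so that the transition is not merely smooth in amplitude but musically plausible.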

BibTeX - Entry

@InCollection{DFU.Vol3.11041.175,
  author =	{Meinard M{\"u}ller and Jonathan Driedger},
  title =	{{Data-Driven Sound Track Generation}},
  booktitle =	{Multimodal Music Processing},
  pages =	{175--194},
  series =	{Dagstuhl Follow-Ups},
  ISBN =	{978-3-939897-37-8},
  ISSN =	{1868-8977},
  year =	{2012},
  volume =	{3},
  editor =	{Meinard M{\"u}ller and Masataka Goto and Markus Schedl},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-34725},
  annote =	{Keywords: Sound track, content-based retrieval, audio matching, time-scale modification, warping, tempo, beat tracking, harmony},
  doi =		{10.4230/DFU.Vol3.11041.175}
}

Keywords: Sound track, content-based retrieval, audio matching, time-scale modification, warping, tempo, beat tracking, harmony
Collection: Multimodal Music Processing
Issue Date: 2012
Date of publication: 27.04.2012
