4.1.1 Creating collections manually

A simple way to create a collection is to redirect the output of the Unix ls command to a file. For example:

     ls /home/gtzan/data/sound/reggae/*.wav > reggae.mf

reggae.mf will look like this:

     /home/gtzan/data/sound/reggae/foo.wav
     /home/gtzan/data/sound/reggae/bar.wav

Any text editor can be used to create collection files. The only constraint is that the name of the collection file must have a .mf extension, such as reggae.mf. In addition, any line starting with the # character is ignored. On Windows (Visual Studio), change the slash character that separates directories as appropriate.
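
When sound files are spread across subdirectories, ls with a single wildcard only lists the files directly inside one directory. As a minimal sketch (the function name make_collection is hypothetical and not part of Marsyas), the following shell function uses find to list every .wav file under a directory, and starts the file with a # comment line, which is ignored as described above:

```shell
# Hypothetical helper (not part of Marsyas): write every .wav file found
# under a directory, including subdirectories, into a collection file.
make_collection() {
    dir=$1   # directory to scan
    out=$2   # output file; its name should end in .mf
    {
        # Lines starting with '#' are ignored when the collection is read.
        printf '# collection generated from %s\n' "$dir"
        # One path per line, sorted so the result is reproducible.
        find "$dir" -type f -name '*.wav' | sort
    } > "$out"
}

# Example usage (the path is illustrative):
# make_collection /home/gtzan/data/sound/reggae reggae.mf
```

Passing an absolute directory yields absolute paths in the collection, matching the ls example above.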

4.1.2 Labels

Labels may be added to collections by appending a tab-separated label after each sound file:

     /home/gtzan/data/sound/reggae/foo.wav \t music
     /home/gtzan/data/sound/reggae/bar.wav \t speech

The \t represents an actual tab character. This allows you to create a “master” collection which includes different kinds of labelled sound files:

     cat music.mf speech.mf > all.mf
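
The labelled files that feed the cat command above can themselves be generated from the shell. The following is a sketch under stated assumptions: the function name label_collection and the directory names in the usage comments are hypothetical, and the function writes a single tab between path and label with no surrounding spaces.

```shell
# Hypothetical helper (not part of Marsyas): print "path<TAB>label"
# for every .wav file directly inside a directory.
label_collection() {
    dir=$1     # directory containing the sound files
    label=$2   # label to attach to every file in that directory
    for f in "$dir"/*.wav; do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -e "$f" ] || continue
        printf '%s\t%s\n' "$f" "$label"
    done
}

# Example usage (directories are illustrative):
# label_collection /home/gtzan/data/sound/music  music  > music.mf
# label_collection /home/gtzan/data/sound/speech speech > speech.mf
# cat music.mf speech.mf > all.mf
```

Using printf with \t avoids typing a literal tab in an editor that might silently convert it to spaces.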

4.1.3 MARSYAS_DATADIR

Collections support the environment variable MARSYAS_DATADIR. This allows .mf files to be shared between users (e.g. for a large audio dataset) even when the data is stored in a different location on each machine. For example, the above collection could be rewritten as:

     MARSYAS_DATADIR/reggae/foo.wav \t music
     MARSYAS_DATADIR/reggae/bar.wav \t speech

provided that the user configures the environment variable appropriately. For example, using bash on Linux or Mac OS X, users on three different machines might each set the variable to their own data directory:

     export MARSYAS_DATADIR=/home/gtzan/data/sound/
     export MARSYAS_DATADIR=/Users/gtzan/data/sound/
     export MARSYAS_DATADIR=/home/gperciva/media/marsyas-data/
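
An existing collection with absolute paths can be converted to the MARSYAS_DATADIR form with sed. This is only a sketch: the function name to_portable is hypothetical, and it assumes the data directory contains no | character and no characters with a special meaning in regular expressions.

```shell
# Hypothetical helper (not part of Marsyas): read a collection on standard
# input and replace a leading data-directory prefix on each line with the
# literal text MARSYAS_DATADIR/.
to_portable() {
    prefix=${1%/}   # drop one trailing slash, if present
    # '|' is used as the sed delimiter so the slashes in the path
    # do not need escaping. Lines without the prefix pass through unchanged.
    sed "s|^$prefix/|MARSYAS_DATADIR/|"
}

# Example usage:
# to_portable /home/gtzan/data/sound/ < all.mf > all-portable.mf
```

Labels and comment lines are left untouched, because only the start of each line is rewritten.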