3.1 Command-line tools

This section gives a few examples of how Marsyas can be used for various audio processing tasks. The command-line tools are documented in detail in the chapter Available tools.

First we will explore audio playback, plugins, and real-time classification of audio into music or speech.

The following commands create two collections, music.mf and speech.mf, each containing 60 30-second audio clips of music and speech respectively (on Windows, small modifications such as changing the directory separator character are required).

     cd MY_MARSYAS_DIR/build/bin
     mkcollection -c music.mf ../../music_speech/music_wav
     mkcollection -c speech.mf ../../music_speech/speech_wav
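
Before going further, it can help to confirm that each collection picked up the expected 60 clips. The sketch below assumes that a collection is a plain text file with one entry per line, and that blank lines and lines starting with # are not entries. count_entries is a hypothetical helper, not a Marsyas tool.

```shell
# count_entries FILE: print the number of entries in a collection file.
# Assumption: one entry per line; blank lines and '#' lines are skipped.
count_entries() {
    awk 'NF > 0 && $0 !~ /^[ \t]*#/ { n++ } END { print n + 0 }' "$1"
}

# Usage (each collection built above should report 60):
#   count_entries music.mf
#   count_entries speech.mf
```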

The following commands give a quick preview of the two collections (the -l 1 argument plays 1 second of audio from each 30-second clip). Press Ctrl-C at any time to exit sfplay.

     sfplay -l 1 music.mf
     sfplay -l 1 speech.mf

Now we are ready to train a classifier that can be used for real-time music/speech discrimination. The first command below extracts audio features, trains a classifier, and writes a text file, ms.mpl, describing the entire audio processing network, including the trained classifier. The sfplugin executable loads this textual description and then processes any audio file, classifying approximately every second of it as either music or speech.

     bextract music.mf speech.mf -cl GS -p ms.mpl
     sfplugin -p ms.mpl ../../audio/music_speech/music_wav/winds.wav
     sfplugin -p ms.mpl ../../audio/music_speech/speech_wav/allison.wav
     sfplugin -p ms.mpl ../../audio/music_speech/music_wav/gravity.wav
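
To run the trained network over every clip in a directory rather than one file at a time, a small loop is enough. This is only a sketch: classify_dir and the SFPLUGIN override variable are hypothetical names introduced here, and the only sfplugin option used is the -p flag shown above.

```shell
# classify_dir PLUGIN DIR: run sfplugin on every .wav file in DIR.
# SFPLUGIN is a hypothetical override; set it to "echo" for a dry run.
classify_dir() {
    for f in "$2"/*.wav; do
        [ -e "$f" ] || continue   # DIR has no .wav files: pattern did not expand
        "${SFPLUGIN:-sfplugin}" -p "$1" "$f"
    done
}

# Usage:
#   classify_dir ms.mpl ../../audio/music_speech/speech_wav
```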

The next example shows how to perform automatic genre classification with one feature vector per file. In the same way as before, we first create a labeled collection for the genres dataset: one collection per genre, concatenated into genres10.mf.

     mkcollection -c cl.mf -l cl ../../audio/genres/classical
     mkcollection -c co.mf -l co ../../audio/genres/country
     mkcollection -c di.mf -l di ../../audio/genres/disco
     mkcollection -c hi.mf -l hi ../../audio/genres/hiphop
     mkcollection -c ja.mf -l ja ../../audio/genres/jazz
     mkcollection -c ro.mf -l ro ../../audio/genres/rock
     mkcollection -c bl.mf -l bl ../../audio/genres/blues
     mkcollection -c re.mf -l re ../../audio/genres/reggae
     mkcollection -c po.mf -l po ../../audio/genres/pop
     mkcollection -c me.mf -l me ../../audio/genres/metal
     cat cl.mf co.mf di.mf hi.mf ja.mf ro.mf bl.mf re.mf po.mf me.mf > genres10.mf
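
The ten mkcollection calls differ only in the genre directory and its two-letter label, so they can also be generated from a list. The sketch below prints the same commands as above instead of running them, so that they can be reviewed first. GENRES and genre_commands are hypothetical names introduced for illustration.

```shell
# Genre directory names and their two-letter labels, as used in the commands above.
GENRES="classical:cl country:co disco:di hiphop:hi jazz:ja rock:ro blues:bl reggae:re pop:po metal:me"

# genre_commands BASE_DIR: print the mkcollection commands and the final cat.
genre_commands() {
    base=$1
    files=""
    for pair in $GENRES; do
        dir=${pair%%:*}      # text before the colon, e.g. classical
        label=${pair##*:}    # text after the colon, e.g. cl
        echo "mkcollection -c $label.mf -l $label $base/$dir"
        files="$files $label.mf"
    done
    echo "cat$files > genres10.mf"
}

# Dry run: print the commands for review.
genre_commands ../../audio/genres

# To execute them, pipe the output to sh:
#   genre_commands ../../audio/genres | sh
```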

Extracting the features and getting statistics about the classification performance (accuracy, confusion matrix, etc.) can be done as follows (make sure the terminal is wide enough to show the confusion matrix correctly):

     bextract -sv genres10.mf -w genres10.arff
     kea -w genres10.arff

Alternatively, the generated .arff file can be opened in the well-known Weka machine learning tool.
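
Because the .arff file is plain text, a quick check of how many feature vectors were written per genre needs no extra tools. The sketch below assumes the usual ARFF layout: data rows follow the @data line, fields are comma-separated, lines starting with % are comments, and the class label is the last field of each row. arff_class_counts is a hypothetical helper, not part of Marsyas or Weka.

```shell
# arff_class_counts FILE: print "label count" for each class, sorted by label.
# Assumptions: rows follow the @data line, fields are comma-separated,
# the class label is the last field, and '%' lines are comments.
arff_class_counts() {
    awk -F, '
        tolower($0) ~ /^@data/ { in_data = 1; next }
        in_data && NF > 1 && $0 !~ /^%/ { count[$NF]++ }
        END { for (c in count) print c, count[c] }
    ' "$1" | sort
}

# Usage:
#   arff_class_counts genres10.arff
```

With one feature vector per file, each label should appear once per clip in its genre directory.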

In addition to classic audio feature extraction and classification, Marsyas can be used for a variety of other audio tasks, for example:

     sfplay ../../audio/music_speech/music_wav/deedee.wav
     phasevocoder -q -ob -p 0.8 ../../audio/music_speech/music_wav/deedee.wav

The first command simply plays the file. The second one uses a phase vocoder to shift the pitch of the audio by a factor of 0.8 without changing its duration. A more interactive exploration of phase vocoding is described in the section User interfaces.
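
To relate the factor to musical intervals, and assuming the value given to -p is a frequency ratio, the shift in equal-tempered semitones is 12 * log2(ratio). The helper below is a hypothetical illustration of that arithmetic and does not call any Marsyas tool.

```shell
# ratio_to_semitones RATIO: convert a frequency ratio to equal-tempered semitones.
# Uses 12 * log2(ratio); awk's log() is the natural logarithm.
ratio_to_semitones() {
    awk -v r="$1" 'BEGIN { printf "%.2f\n", 12 * log(r) / log(2) }'
}

ratio_to_semitones 0.8   # prints -3.86
```

Under that assumption, a factor of 0.8 lowers the pitch by a little less than four semitones.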

Finally, efficient extraction of the dominant melodic sound source, based on spectral clustering of sinusoidal components, can be demonstrated as follows:

     sfplay ../../audio/music_speech/music_wav/nearhou.wav
     peakClustering ../../audio/music_speech/music_wav/nearhou.wav
     sfplay nearhouSep.wav