This section provides a few examples of how Marsyas can be used for various audio processing tasks. Detailed documentation of the command-line tools is provided in the chapter Available tools.
First we will explore audio playback, plugins, and real-time classification of audio into music and speech.
The following commands create two collections, music.mf and speech.mf, containing 60 30-second audio clips of music and speech respectively. On Windows, small modifications such as changing the directory separator character are required.
cd MY_MARSYAS_DIR/build/bin
mkcollection -c music.mf ../../audio/music_speech/music_wav
mkcollection -c speech.mf ../../audio/music_speech/speech_wav
The following commands give a quick preview of the two collections (the -l 1 argument plays 1 second of audio from each 30-second clip). Press Ctrl-C at any time to exit sfplay.
sfplay -l 1 music.mf
sfplay -l 1 speech.mf
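As a quick sanity check before playback or training, the number of clips in each collection can be counted. The sketch below assumes that a collection file is plain text with one audio file path per line; that format and the helper name count_clips are assumptions of this example and are not stated above.

```shell
#!/bin/sh
# Sketch: count the entries in a Marsyas collection file.
# Assumption: a .mf file is plain text with one audio path per line.
count_clips() {
    # grep -c . counts the non-empty lines of the file
    grep -c . "$1"
}

for mf in music.mf speech.mf; do
    if [ -f "$mf" ]; then
        echo "$mf: $(count_clips "$mf") clips"
    else
        echo "$mf: not found (run mkcollection first)"
    fi
done
```

If the counts differ from what you expect, check the directory paths passed to mkcollection.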
Now we are ready to train a classifier that can be used for real-time music/speech discrimination. The following command extracts audio features, trains a classifier, and writes a text file, ms.mpl, that describes the entire audio processing network, including the trained classifier. The sfplugin executable loads this textual description and then processes any audio file, classifying approximately every second of it as either music or speech.
bextract music.mf speech.mf -cl GS -p ms.mpl
sfplugin -p ms.mpl ../../audio/music_speech/music_wav/winds.wav
sfplugin -p ms.mpl ../../audio/music_speech/speech_wav/allison.wav
sfplugin -p ms.mpl ../../audio/music_speech/music_wav/gravity.wav
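To run the trained plugin over every file in a directory rather than one file at a time, the sfplugin call can be wrapped in a loop. This is only a sketch: the helper name classify_dir is hypothetical, and the only Marsyas invocation it relies on is the `sfplugin -p` form shown above.

```shell
#!/bin/sh
# Sketch: run a trained plugin on every .wav file in a directory.
# classify_dir is a hypothetical helper, not part of Marsyas.
classify_dir() {
    plugin="$1"
    dir="$2"
    for f in "$dir"/*.wav; do
        [ -e "$f" ] || continue      # skip if the glob matched nothing
        echo "== $f"
        sfplugin -p "$plugin" "$f" || return 1
    done
}

# Only run when sfplugin and the trained plugin file are available.
if command -v sfplugin >/dev/null 2>&1 && [ -f ms.mpl ]; then
    classify_dir ms.mpl ../../audio/music_speech/speech_wav
fi
```

Each file name is printed before its classification output, which makes it easier to see which clips are misclassified.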
The next example shows how to perform automatic genre classification with one feature vector per file using Marsyas. As before, we first create a labeled collection for each genre in the genres dataset and then concatenate them into a single collection.
mkcollection -c cl.mf -l cl ../../audio/genres/classical
mkcollection -c co.mf -l co ../../audio/genres/country
mkcollection -c di.mf -l di ../../audio/genres/disco
mkcollection -c hi.mf -l hi ../../audio/genres/hiphop
mkcollection -c ja.mf -l ja ../../audio/genres/jazz
mkcollection -c ro.mf -l ro ../../audio/genres/rock
mkcollection -c bl.mf -l bl ../../audio/genres/blues
mkcollection -c re.mf -l re ../../audio/genres/reggae
mkcollection -c po.mf -l po ../../audio/genres/pop
mkcollection -c me.mf -l me ../../audio/genres/metal
cat cl.mf co.mf di.mf hi.mf ja.mf ro.mf bl.mf re.mf po.mf me.mf > genres10.mf
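The ten mkcollection commands above follow one pattern: the label is the first two letters of the genre directory name. A loop can express the same thing, as in the sketch below. The GENRES_DIR variable, the helper name build_genre_collections, and the --dry-run switch are assumptions made for this example; the mkcollection arguments are the ones used above.

```shell
#!/bin/sh
# Sketch: build one labeled collection per genre, then concatenate them.
# GENRES_DIR, build_genre_collections and --dry-run are hypothetical names.
GENRES_DIR="${GENRES_DIR:-../../audio/genres}"

build_genre_collections() {
    dry_run="${1:-}"
    files=""
    for genre in classical country disco hiphop jazz rock blues reggae pop metal; do
        label=$(printf '%s' "$genre" | cut -c1-2)   # two-letter label, e.g. "cl"
        if [ "$dry_run" = "--dry-run" ]; then
            echo "mkcollection -c $label.mf -l $label $GENRES_DIR/$genre"
        else
            mkcollection -c "$label.mf" -l "$label" "$GENRES_DIR/$genre" || return 1
        fi
        files="$files $label.mf"
    done
    if [ "$dry_run" = "--dry-run" ]; then
        echo "cat$files > genres10.mf"
    else
        # $files is left unquoted on purpose so it splits into one argument per file
        cat $files > genres10.mf
    fi
}

# Print the commands first; call without --dry-run to execute them.
build_genre_collections --dry-run
```

The dry run prints the same eleven commands listed above, so the output can be compared against them before anything is executed.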
Feature extraction and classification statistics (accuracy, confusion matrix, etc.) can be obtained as follows. Make sure the terminal is wide enough to display the confusion matrix correctly.
bextract -sv genres10.mf -w genres10.arff
kea -w genres10.arff
Alternatively, the generated .arff file can be opened in the well-known Weka machine learning tool.
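Before running a classifier on the feature file, it can be useful to check how many feature vectors were written for each genre. The following sketch assumes the standard ARFF layout: a line starting with @data, followed by comma-separated rows whose last field is the class label, with comment lines starting with %. The helper name arff_class_counts is hypothetical.

```shell
#!/bin/sh
# Sketch: count instances per class label in an ARFF file.
# Assumption: the class label is the last comma-separated field of each data row.
arff_class_counts() {
    awk -F',' '
        tolower($0) ~ /^@data/ { in_data = 1; next }      # data section starts here
        in_data && $0 !~ /^%/ && NF > 1 { counts[$NF]++ }  # skip comments and blanks
        END { for (c in counts) print c, counts[c] }
    ' "$1" | sort
}

if [ -f genres10.arff ]; then
    arff_class_counts genres10.arff
fi
```

A genre with a noticeably lower count than the others usually points to a wrong path or unreadable files in the corresponding collection.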
In addition to classic audio feature extraction and classification, Marsyas can be used for a variety of other audio tasks.
sfplay ../../audio/music_speech/music_wav/deedee.wav
phasevocoder -q -ob -p 0.8 ../../audio/music_speech/music_wav/deedee.wav
The first command simply plays the file. The second one uses a phase vocoder to pitch-shift the audio by a factor of 0.8 without changing its duration. A more interactive exploration of phase vocoding is described in the section User interfaces.
Finally, efficient extraction of the dominant melodic sound source, based on spectral clustering of sinusoidal components, can be demonstrated as follows:
sfplay ../../audio/music_speech/music_wav/nearhou.wav
peakClustering ../../audio/music_speech/music_wav/nearhou.wav
sfplay nearhouSep.wav