
4.4.1 phasevocoder

phasevocoder is probably the most powerful and canonical example of sound synthesis currently provided by Marsyas. It is based on the phase vocoder implementation described by F. R. Moore in his book “Elements of Computer Music”. The implementation is broken into individual MarSystems in a modular way and can be used for pitch shifting and/or time scaling of both sound files and real-time input. Several variations of the algorithm proposed in the literature have been implemented and can be selected through command-line options. Familiarity with phase vocoder terminology will help in understanding their effect on the transformed sound file. Some representative examples are:

     phasevocoder foo.wav -f foo_identity.wav
     phasevocoder foo.wav -f foo_stretched.wav -n 2048 -w 2048 -d 256 -i 512
     phasevocoder foo.wav -ob -cm sorted -s 10 -p 1.5 -f foo_pitch_shifted.wav
     phasevocoder foo.wav -f foo_stretched.wav -n 4096 -w 4096 -d 768 -i 1024 \
          -cm full -ucm identity_phaselock
     phasevocoder foo.wav -f foo_stretched.wav -n 4096 -w 4096 -d 768 -i 1024 \
          -cm analysis_scaled_phaselock -ucm scaled_phaselock

In the first example the input file foo.wav is passed through the classic phasevocoder (overlap-add, FFT front-end and FFT back-end) without any time or pitch modification. The second example shows how time stretching can be achieved by making the analysis hop size (-d) and the synthesis hop size (-i) different; here the synthesis hop is twice the analysis hop, so the output is about twice as long as the input. The -n option specifies the FFT size and the -w option specifies the window size. In the third example a bank of sinusoidal oscillators (-ob) is used instead of the FFT back-end and the input is pitch shifted by a factor of 1.5. The fourth example uses identity phaselocking (-ucm) and the fifth example uses scaled phaselocking (-cm and -ucm) as described by Laroche and Dolson.
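The relationship between the two hop sizes and the resulting duration can be sketched in a few lines of Python. This is only an illustration, assuming the usual phase vocoder convention that the time-scaling factor is the synthesis hop size divided by the analysis hop size; the function name stretch_factor is hypothetical and is not part of Marsyas.

```python
from fractions import Fraction

def stretch_factor(decimation, interpolation):
    """Time-scaling factor implied by the analysis hop (-d) and the synthesis hop (-i).

    Assumes the conventional relation: factor = synthesis hop / analysis hop.
    """
    if decimation <= 0 or interpolation <= 0:
        raise ValueError("hop sizes must be positive")
    return Fraction(interpolation, decimation)

# Second example above: -d 256 -i 512
print(float(stretch_factor(256, 512)))  # 2.0
# Fourth and fifth examples above: -d 768 -i 1024
print(stretch_factor(768, 1024))        # 4/3
```

A factor greater than 1 lengthens the sound and a factor smaller than 1 shortens it; equal hop sizes leave the duration unchanged.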

-n --fftsize
size of the FFT
-w --winsize
size of the window
-v --voices
number of voices
-g --gain
linear volume gain
-b --bufferSize
audio buffer size
-m --midi
MIDI input port number
-e --epochHeterophonics
heterophonics epoch
-d --decimation
analysis hop size (decimation)
-i --interpolation
synthesis hop size (interpolation)
-p --pitchshift
pitch shift factor (for example, 2.0 shifts the pitch up by one octave)
-ob --oscbank
use bank of oscillators back-end
-s --sinusoids
number of sinusoids to use when the convert mode (-cm) is sorted
-cm --convertmode
analysis front-end mode. full: use all FFT bins; sorted: sort FFT bins by magnitude and use only the number of sinusoids given by -s; analysis_scaled_phaselock: compute the extra analysis information needed for scaled phaselocking
-ucm --unconvertmode
synthesis back-end mode. classic: propagate phases for all bins; loose_phaselock: the method described by Puckette; identity_phaselock: pick peaks, propagate phases for the peaks and lock the regions of influence around them; scaled_phaselock: a refinement that takes into account information from the previous frame
-on --onsets filename_with_onsets
takes as input a simple text file with onset locations. The phases are re-initialized at these onsets, and the transient frames that contain them are not time stretched.
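The pitch shift factor given to -p is a frequency ratio, so it can be convenient to think of it in semitones. The following is a minimal sketch, assuming twelve-tone equal temperament; the helper names are hypothetical and are not part of Marsyas.

```python
import math

def factor_to_semitones(factor):
    """Convert a frequency ratio (as passed to -p) to equal-tempered semitones."""
    if factor <= 0:
        raise ValueError("pitch shift factor must be positive")
    return 12.0 * math.log2(factor)

def semitones_to_factor(semitones):
    """Convert a shift in equal-tempered semitones to a frequency ratio for -p."""
    return 2.0 ** (semitones / 12.0)

print(round(factor_to_semitones(2.0), 2))  # 12.0 (one octave)
print(round(factor_to_semitones(1.5), 2))  # 7.02 (close to a perfect fifth)
print(round(semitones_to_factor(-12), 2))  # 0.5 (one octave down)
```

Under this assumption, the factor of 1.5 used in the third example corresponds to an upward shift of roughly seven semitones.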