# A Phase Vocoder in Python

In this note, I will show how the Phase Vocoder algorithm can be realised in Python, with the help of its very useful scientific libs. This little PV program does timestretching of an input.

First we import the required packages: sys, scipy, pylab and scipy.io. I am being quite liberal here, not using namespaces, but good practice tells us we should not do this. However, this simplifies the reading of the code:

import sys

from scipy import *

from pylab import *

from scipy.io import wavfile

Then we set our analysis parameters DFT size (N) and hopsize (H)

N = 2048

H = N/4

Take in an input soundfile name and a timescale factor from the command line:

# read input and get the timescale factor

(sr,signalin) = wavfile.read(sys.argv[2])

L = len(signalin)

tscale = float(sys.argv[1])

Set up our signal arrays to hold the processing output

# signal blocks for processing and output

phi = zeros(N)

out = zeros(N, dtype=complex)

sigout = zeros(L/tscale+N)

Find out what the peak amp of input (for scaling) and create a hanning window

# max input amp, window

amp = max(signalin)

win = hanning(N)

This is the processing loop. We’ll do the PV idea in a slightly different way from the example in the book. There, we created a spectral signal made up of amp,freq frames. Here we will not bother with this, we will just move along the input, calculating the PV parameters of two consecutive windows and then resynthesise these straight away. Timescale changes will happen if we move along the input at a different hopsize than H. The input will be overlap-added every H samples, which is also the hopsize basis of our PV analyses (the hop between the two consecutive analyses).

p = 0

pp = 0

while p < L-(N+H):# take the spectra of two consecutive windows

p1 = int(p)

spec1 = fft(win*signalin[p1:p1+N])

spec2 = fft(win*signalin[p1+H:p1+N+H])# take their phase difference and integrate

phi += (angle(spec2) – angle(spec1))# bring the phase back to between pi and -pi

while phi < -pi: phi += 2*pi

while phi >= pi: phi -= 2*pi

out.real, out.imag = cos(phi), sin(phi)# inverse FFT and overlap-add

sigout[pp:pp+N] += win*ifft(abs(spec2)*out)

pp += H

p += H*tscale

Then we just write the output and scale it to match the original amp.

# write file to output, scaling it to original amp

wavfile.write(sys.argv[3],sr,array(amp*sigout/max(sigout), dtype=’int16′))

we also attempt to play using sndfile-play if it is available:

# play it using a libsndfile utility

import os

try: os.spawnlp(os.P_WAIT, ‘sndfile-play’, ‘sndfile-play’, sys.argv[3])

except: pass

So, a slightly different way of doing things, demonstrating that there is always more than one way to skin a goat. Here is the full program (I hope the formatting does not get broken too much):

`# phase vocoder example`

# (c) V Lazzarini, 2010

# GNU Public License

# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #

import sys

from scipy import *

from pylab import *

from scipy.io import wavfile

N = 2048

H = N/4

# read input and get the timescale factor

(sr,signalin) = wavfile.read(sys.argv[2])

L = len(signalin)

tscale = float(sys.argv[1])

# signal blocks for processing and output

phi = zeros(N)

out = zeros(N, dtype=complex)

sigout = zeros(L/tscale+N)

# max input amp, window

amp = max(signalin)

win = hanning(N)

p = 0

pp = 0

`while p < L-(N+H):`

`# take the spectra of two consecutive windows`

p1 = int(p)

spec1 = fft(win*signalin[p1:p1+N])

spec2 = fft(win*signalin[p1+H:p1+N+H])

# take their phase difference and integrate

phi += (angle(spec2) - angle(spec1))

# bring the phase back to between pi and -pi

while phi < -pi: phi += 2*pi

while phi >= pi: phi -= 2*pi

out.real, out.imag = cos(phi), sin(phi)

# inverse FFT and overlap-add

sigout[pp:pp+N] += win*ifft(abs(spec2)*out)

pp += H

p += H*tscale

# write file to output, scaling it to original amp

`wavfile.write(sys.argv[3],sr,array(amp*sigout/max(sigout), dtype='int16'))`

# play it using a libsndfile utility

`import os`

try:

`os.spawnlp(os.P_WAIT, 'sndfile-play', 'sndfile-play', sys.argv[3])`

`except:`

`pass`

Thanks for the code.

I have a few doubts though.

Running this code, gave me an error regarding the ambiguity of the statements,

while phi = pi: phi -= 2*pi

I rectified this using phi.any()

But I also get a warning on the line:

ComplexWarning: Casting complex values to real discards the imaginary part

sigout[pp:pp+N] += win*ifft(abs(spec2)*out)

Is this the required implementation, or were the complex numbers supposed to stay complex?

I do not have any background in signal processing, so I would be grateful for your help

Actually, this must have changed in pylab, because it used to work in earlier versions. The correct fix is:

for i in phi:

while i > pi: i -= 2*pi

while i <= -pi: i += 2*pi

The warning is spurious, since the result should be real (imag = 0). But if it bothers you,

this will fix it:

sigout[pp:pp+N] += (win*ifft(abs(spec2)*out)).real

And is there a good way to check whether the program actually works or not, keeping timescale=1 ?

I am not sure what you mean? The code works.