Skip to content

A Phase Vocoder in Python

March 2, 2012

In this note, I will show how the Phase Vocoder algorithm can be realised in Python, with the help of its very useful scientific libs. This little PV program does timestretching of an input.

First we import the required packages: sys, scipy, pylab and scipy.io. I am being quite liberal here, not using namespaces, but good practice tells us we should not do this. However, this simplifies the reading of the code:

import sys
from scipy import *
from pylab import *
from scipy.io import wavfile

Then we set our analysis parameters DFT size (N) and hopsize (H)

N = 2048
H = N/4

Take in an input soundfile name and a timescale factor from the command line:

# read input and get the timescale factor
(sr,signalin) = wavfile.read(sys.argv[2])
L = len(signalin)
tscale = float(sys.argv[1])

Set up our signal arrays to hold the processing output

# signal blocks for processing and output
phi  = zeros(N)
out = zeros(N, dtype=complex)
sigout = zeros(L/tscale+N)

Find out what the peak amp of input (for scaling) and create a hanning window

# max input amp, window
amp = max(signalin)
win = hanning(N)

This is the processing loop. We’ll do the PV idea in a slightly different way from the example in the book. There, we created a spectral signal made up of amp,freq frames. Here we will not bother with this, we will just move along the input, calculating the PV parameters of two consecutive windows and then resynthesise these straight away. Timescale changes will happen if we move along the input at a different hopsize than H. The input will be overlap-added every H samples, which is also the hopsize basis of our PV analyses (the hop between the two consecutive analyses).

p = 0
pp = 0
while p < L-(N+H):

# take the spectra of two consecutive windows
p1 = int(p)
spec1 =  fft(win*signalin[p1:p1+N])
spec2 =  fft(win*signalin[p1+H:p1+N+H])

# take their phase difference and integrate
phi += (angle(spec2) – angle(spec1))

# bring the phase back to between pi and -pi
while phi < -pi: phi += 2*pi
while phi >= pi: phi -= 2*pi
out.real, out.imag = cos(phi), sin(phi)

# inverse FFT and overlap-add
sigout[pp:pp+N] += win*ifft(abs(spec2)*out)
pp += H
p += H*tscale

Then we just write the output and scale it to match the original amp.

  #  write file to output, scaling it to original amp

wavfile.write(sys.argv[3],sr,array(amp*sigout/max(sigout), dtype=’int16′))

we also attempt to play using sndfile-play if it is available:

  #  play it using a libsndfile utility
import os
try: os.spawnlp(os.P_WAIT, ‘sndfile-play’, ‘sndfile-play’, sys.argv[3])
except: pass

So, a slightly different way of doing things, demonstrating that there is always more than one way to skin a goat. Here is the full program (I hope the formatting does not get broken too much):

# phase vocoder example
# (c) V Lazzarini, 2010
# GNU Public License
# # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # # #


import sys
from scipy import *
from pylab import *
from scipy.io import wavfile


N = 2048
H = N/4


# read input and get the timescale factor
(sr,signalin) = wavfile.read(sys.argv[2])
L = len(signalin)
tscale = float(sys.argv[1])
# signal blocks for processing and output
phi  = zeros(N)
out = zeros(N, dtype=complex)
sigout = zeros(L/tscale+N)

# max input amp, window
amp = max(signalin)
win = hanning(N)
p = 0
pp = 0

while p < L-(N+H):

# take the spectra of two consecutive windows
p1 = int(p)
spec1 =  fft(win*signalin[p1:p1+N])
spec2 =  fft(win*signalin[p1+H:p1+N+H])
# take their phase difference and integrate
phi += (angle(spec2) - angle(spec1))
# bring the phase back to between pi and -pi
while phi < -pi: phi += 2*pi
while phi >= pi: phi -= 2*pi
out.real, out.imag = cos(phi), sin(phi)
# inverse FFT and overlap-add
sigout[pp:pp+N] += win*ifft(abs(spec2)*out)
pp += H
p += H*tscale


#  write file to output, scaling it to original amp

wavfile.write(sys.argv[3],sr,array(amp*sigout/max(sigout), dtype='int16'))


#  play it using a libsndfile utility

import os
try:

os.spawnlp(os.P_WAIT, 'sndfile-play', 'sndfile-play', sys.argv[3])

except:

pass

Advertisements
4 Comments
  1. Thanks for the code.
    I have a few doubts though.
    Running this code, gave me an error regarding the ambiguity of the statements,
    while phi = pi: phi -= 2*pi
    I rectified this using phi.any()
    But I also get a warning on the line:
    ComplexWarning: Casting complex values to real discards the imaginary part
    sigout[pp:pp+N] += win*ifft(abs(spec2)*out)

    Is this the required implementation, or were the complex numbers supposed to stay complex?
    I do not have any background in signal processing, so I would be grateful for your help

    • Actually, this must have changed in pylab, because it used to work in earlier versions. The correct fix is:

      for i in phi:
      while i > pi: i -= 2*pi
      while i <= -pi: i += 2*pi

      The warning is spurious, since the result should be real (imag = 0). But if it bothers you,
      this will fix it:

      sigout[pp:pp+N] += (win*ifft(abs(spec2)*out)).real

  2. And is there a good way to check whether the program actually works or not, keeping timescale=1 ?

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: