convolving an audio signal

I am trying to write an audio fingerprinting library for educational purposes. It's based on Computer Vision for Music Identification. I have a couple of questions relating to the contents of the paper.

  1. I know that two bytes represent a sample, so I wrote this class to extract the samples from a PCM file. I'd like to know if this is right (sorry if it's too obvious :) ).

    class FingerPrint:
       def __init__(self, pcmFile):
          self.pcmFile = pcmFile
          self.samples = []
       def init(self):
          # Current samples
          currentSamples = []
          # Read pcm file
          with open(self.pcmFile, 'rb') as f:
             # Read two bytes (one 16-bit sample) at a time until end of file
             byte = f.read(2)
             while byte:
               byte = f.read(2)
    fp = FingerPrint('output.pcm')
  2. If the above code is OK, then according to the book the next thing to do is to convolve the signal with a low-pass filter and take every 8th sample. I don't understand these steps or why they have to be done. It would be awesome if someone could help me understand (with code if possible).


  • After reading the two bytes, you need to convert them into an int. You can use the struct module.
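As a minimal sketch of that conversion, assuming the file holds 16-bit signed little-endian samples (a common layout for raw PCM, but an assumption here), `struct.unpack` can decode a whole chunk of bytes at once. The helper name `bytes_to_samples` is made up for illustration.

```python
import struct

def bytes_to_samples(raw):
    """Decode raw 16-bit signed little-endian PCM bytes into a list of ints."""
    count = len(raw) // 2              # ignore a trailing odd byte, if any
    # '<' = little-endian, 'h' = signed 16-bit integer, repeated `count` times
    return list(struct.unpack('<%dh' % count, raw[:count * 2]))

# Two samples, 1 and -2, stored least significant byte first
print(bytes_to_samples(b'\x01\x00\xfe\xff'))
```

If the data turns out to be big-endian, the format prefix would be `'>'` instead of `'<'`.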

    But I think you should use NumPy and SciPy:

    To read a WAV file, you can call

    If your file is raw PCM data, you can call numpy.fromfile()

    for example:

    import numpy
    data = numpy.fromfile("test.pcm", dtype=numpy.int16)

    To design a lowpass filter, you can use the filter design functions in scipy.signal:
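One function that could serve here is `scipy.signal.firwin`, which designs a windowed FIR filter. The sketch below is only an illustration: the tap count and cutoff are assumed values, not taken from the paper. The cutoff is placed at 1/8 of the Nyquist frequency because the signal is later reduced to every 8th sample; a real design might put it somewhat lower.

```python
import numpy as np
from scipy import signal

# Assumed parameters for illustration only (not from the paper)
num_taps = 101          # filter length; an odd length gives a symmetric filter
cutoff = 1.0 / 8        # normalized so that 1.0 is the Nyquist frequency

# Design a lowpass FIR filter (Hamming window by default)
taps = signal.firwin(num_taps, cutoff)

print(len(taps))        # number of filter coefficients
print(np.sum(taps))     # DC gain of the filter
```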

    To do the convolution, you can use the convolution functions in scipy.signal:
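For instance, `scipy.signal.convolve` takes the signal and the filter coefficients as arrays. The tiny signal and three-tap kernel below are made-up values chosen so the result is easy to check by hand.

```python
import numpy as np
from scipy import signal

x = np.array([0.0, 0.0, 4.0, 0.0, 0.0])   # a single spike
h = np.array([0.25, 0.5, 0.25])           # small smoothing kernel

# mode='same' returns an output the same length as x, centered on the input
y = signal.convolve(x, h, mode='same')
print(y)
```

The spike is spread over its neighbours, which is the smoothing effect a low-pass filter has on a signal.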

    There is also a convolve function in numpy:
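Putting the pieces together, the two steps from the question might look like the sketch below, using only NumPy. The low-pass filter is there because keeping every 8th sample divides the sample rate by 8 (44100 Hz would become about 5512 Hz), and any content above half the new rate would otherwise fold back into the result as aliasing. The function name `lowpass_and_decimate`, the 101-tap windowed-sinc design, and the synthetic test tones are all assumptions for illustration, not the paper's actual filter.

```python
import numpy as np

def lowpass_and_decimate(samples, factor=8, num_taps=101):
    """Low-pass filter `samples` by convolution, then keep every `factor`-th one."""
    # Windowed-sinc low-pass filter with its cutoff at the Nyquist
    # frequency of the downsampled signal (assumed design).
    n = np.arange(num_taps) - (num_taps - 1) / 2.0
    taps = np.sinc(n / float(factor)) * np.hamming(num_taps)
    taps /= taps.sum()                          # unity gain at DC
    filtered = np.convolve(samples, taps, mode='same')
    return filtered[::factor]                   # take every `factor`-th sample

# Synthetic input: a 50 Hz tone plus a 3000 Hz tone, sampled at 8000 Hz
rate = 8000
t = np.arange(rate) / float(rate)
signal_in = np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 3000 * t)

out = lowpass_and_decimate(signal_in)
print(len(signal_in), len(out))
```

Here the output is at 1000 Hz, so only content below 500 Hz can be represented: the 50 Hz tone passes through the filter, while the 3000 Hz tone is strongly attenuated before the samples are dropped.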