
Read frequency from mic on Raspberry Pi


Is there a simple way to record a few seconds of sound and convert it to frequency? I have a USB mic and a Raspberry Pi 2 B.

In the posted file (convert2note.py) I am wondering how to make f equal to the frequency obtained from the mic. This is what the program looks like so far:

#d = 69 + 12*log2(f/440)
#d is midi, f is frequency
import math
f=raw_input("Type the frequency to be converted to midi: ")
d=69+(12*math.log(float(f)/440))/(math.log(2))
d=int(round(d))
notes = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
print notes[d % len(notes)]
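As a quick sanity check of the formula above (a minimal sketch, separate from the posted script): 440 Hz should come out as A (MIDI note 69), and middle C (about 261.63 Hz) as C.

```python
import math

NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_note(f):
    # MIDI note number: d = 69 + 12*log2(f/440)
    d = int(round(69 + 12 * math.log(f / 440.0, 2)))
    return NOTES[d % 12]

print(freq_to_note(440.0))   # A
print(freq_to_note(261.63))  # C
```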

Thanks a ton in advance :D


Solution

  • For capturing audio, you could for example use the sox program. See the linked documentation for details, but it could be as simple as:

    rec input.wav
    

But the following command produces a file matching the format expected by the code below:

rec -c 2 -b 16 -e signed-integer -r 44100 input.wav
    

    (Technically only the -c, -b and -e options are necessary to match the code below. You could reduce the sample rate -r to speed up the processing.)

    For processing the audio in Python, it would be best to save it in a wav file, since Python has a module for reading those in the standard library.
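As a self-contained sketch of the wave module (here writing a short stereo 16-bit file first, so there is something to read back; with a real recording you would only need the reading half):

```python
import wave

# Write one second of 2-channel, 16-bit, 44100 Hz silence.
ww = wave.open('input.wav', 'wb')
ww.setnchannels(2)
ww.setsampwidth(2)      # bytes per sample -> 16-bit
ww.setframerate(44100)
ww.writeframes(b'\x00' * (2 * 2 * 44100))  # 2 bytes * 2 channels * 44100 frames
ww.close()

# Read back the parameters the processing code relies on.
wr = wave.open('input.wav', 'rb')
print('channels:', wr.getnchannels())      # 2
print('sample width:', wr.getsampwidth())  # 2 bytes = 16-bit
print('frame rate:', wr.getframerate())    # 44100
print('frames:', wr.getnframes())          # 44100
wr.close()
```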

    For converting the audio to frequencies we'll use the discrete Fourier transform in the form of NumPy's fast Fourier transform for real input. See the code fragment below, where I'm also using matplotlib to make plots.

    The code below assumes a 2-channel (stereo) 16-bit WAV file.

    from __future__ import print_function, division
    import wave
    import numpy as np
    import matplotlib.pyplot as plt
    
    wr = wave.open('input.wav', 'r')
    sz = wr.getframerate()  # frames (samples) per second
    q = 5  # time window to analyze in seconds
    c = 12  # number of time windows to process
    sf = 1.5  # signal scale factor
    
    for num in range(c):
        print('Processing from {} to {} s'.format(num*q, (num+1)*q))
        avgf = np.zeros(int(sz/2+1))
        snd = np.array([])
        # The sound signal for q seconds is concatenated. The fft over that
        # period is averaged to average out noise.
        for j in range(q):
        da = np.frombuffer(wr.readframes(sz), dtype=np.int16)
            left, right = da[0::2]*sf, da[1::2]*sf
            lf, rf = abs(np.fft.rfft(left)), abs(np.fft.rfft(right))
            snd = np.concatenate((snd, (left+right)/2))
            avgf += (lf+rf)/2
        avgf /= q
        # Plot both the signal and frequencies.
        plt.figure(1)
        a = plt.subplot(211)  # signal
        r = 2**16/2
        a.set_ylim([-r, r])
        a.set_xlabel('time [s]')
        a.set_ylabel('signal [-]')
        x = np.arange(sz*q)/sz
        plt.plot(x, snd)
        b = plt.subplot(212)  # frequencies
        b.set_xscale('log')
        b.set_xlabel('frequency [Hz]')
        b.set_ylabel('|amplitude|')
        plt.plot(abs(avgf))
        plt.savefig('simple{:02d}.png'.format(num))
        plt.clf()
    

    The avgf array now holds the average of the left and right channels' frequency spectra. The plots look like this:

    sample graph

    As you can see, a sound signal generally holds many frequencies.
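To get back to the original question of turning the spectrum into a single frequency f: one simple (and admittedly naive) approach is to take the bin with the largest magnitude. Since each analyzed chunk above is exactly one second long, the bin index equals the frequency in Hz; for other chunk lengths, np.fft.rfftfreq gives the bin frequencies. A sketch with a synthetic 440 Hz sine standing in for the mic data:

```python
from __future__ import print_function, division
import numpy as np

rate = 44100
t = np.arange(rate) / rate            # one second of sample times
signal = np.sin(2 * np.pi * 440 * t)  # stand-in for one second of mic audio

spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(len(signal), d=1/rate)
f = freqs[np.argmax(spectrum)]        # dominant frequency, here 440 Hz

# Feed it into the question's frequency-to-MIDI conversion.
d = int(round(69 + 12 * np.log2(f / 440)))
notes = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
print(f, notes[d % 12])  # 440.0 A
```

A real mic signal holds many frequencies at once (as the plots show), so picking the single largest bin works best for a sustained single tone; for anything more complex you'd want a proper pitch-detection method.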