python, wav, speech-to-text, voice-recording

wave.Error: unknown format: 3 arises when trying to convert a wav file into text in Python


I need to record audio from the microphone and convert it into text. The conversion works fine with several audio clips that I downloaded from the web, but when I try to convert a clip that I recorded from the microphone, it gives the following error.

Traceback (most recent call last):
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\site-packages\speech_recognition\__init__.py", line 203, in __enter__
    self.audio_reader = wave.open(self.filename_or_fileobject, "rb")
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 510, in open
    return Wave_read(f)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 164, in __init__
    self.initfp(f)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 144, in initfp
    self._read_fmt_chunk(chunk)
  File "C:\Users\HP\AppData\Local\Programs\Python\Python37\lib\wave.py", line 269, in _read_fmt_chunk
    raise Error('unknown format: %r' % (wFormatTag,))
wave.Error: unknown format: 3

The code I am trying is as follows.

import speech_recognition as sr
import sounddevice as sd
from scipy.io.wavfile import write

# recording from the microphone
fs = 44100  # Sample rate
seconds = 3  # Duration of recording

myrecording = sd.rec(int(seconds * fs), samplerate=fs, channels=2)
sd.wait()  # Wait until recording is finished
write('output.wav', fs, myrecording)  # Save as WAV file
sound = "output.wav"
recognizer = sr.Recognizer()

with sr.AudioFile(sound) as source:
     recognizer.adjust_for_ambient_noise(source)
     print("Converting audio file to text...")
     audio = recognizer.listen(source)

     try:
          text = recognizer.recognize_google(audio)
          print("The converted text:" + text)

     except Exception as e:
          print(e)

I looked at similar questions that have been answered, and they say the file needs to be converted into a different WAV format. Can someone point me to code or a library that I can use for this conversion? Thank you in advance.


Solution

  • sd.rec returns float32 samples by default, so the file was written in 32-bit float format, as soxi shows:

    soxi output.wav 
    
    Input File     : 'output.wav'
    Channels       : 2
    Sample Rate    : 44100
    Precision      : 25-bit
    Duration       : 00:00:03.00 = 132300 samples = 225 CDDA sectors
    File Size      : 1.06M
    Bit Rate       : 2.82M
    Sample Encoding: 32-bit Floating Point PCM
    

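    The wave module, which speech_recognition uses to open the file, only reads integer PCM (format tag 1). Format tag 3 is IEEE floating point, hence "unknown format: 3".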

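    To store the recording as 16-bit PCM, ask sounddevice for int16 samples when recording. Casting the float data with astype(np.int16) alone is not enough: the samples lie in [-1.0, 1.0], so nearly all of them would truncate to 0 and the file would be silent.

    myrecording = sd.rec(int(seconds * fs), samplerate=fs, channels=2, dtype='int16')
    sd.wait()  # Wait until recording is finished
    write('output.wav', fs, myrecording)  # int16 array, so scipy saves 16-bit PCM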
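
If a float WAV file already exists on disk and re-recording is not an option, it can be converted afterwards. The following is a minimal sketch using only the standard library. The helper names read_wav_chunks and float32_wav_to_int16 are made up for this example, and it assumes a plain RIFF file with 32-bit little-endian float samples in the range [-1.0, 1.0], which is what the soxi output above describes.

```python
import struct
import wave


def read_wav_chunks(path):
    """Return (format_tag, channels, sample_rate, bits_per_sample, data_bytes)."""
    with open(path, "rb") as f:
        riff, _size, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        fmt = data = None
        while True:
            header = f.read(8)
            if len(header) < 8:
                break  # end of file
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            body = f.read(chunk_size)
            if chunk_size % 2:
                f.read(1)  # chunks are padded to an even length
            if chunk_id == b"fmt ":
                # First 16 bytes of the fmt chunk are the common fields
                fmt = struct.unpack("<HHIIHH", body[:16])
            elif chunk_id == b"data":
                data = body
    if fmt is None or data is None:
        raise ValueError("missing fmt or data chunk")
    format_tag, channels, sample_rate, _byte_rate, _block_align, bits = fmt
    return format_tag, channels, sample_rate, bits, data


def float32_wav_to_int16(src, dst):
    """Rewrite a 32-bit float WAV (format tag 3) as 16-bit PCM."""
    format_tag, channels, sample_rate, bits, data = read_wav_chunks(src)
    if format_tag != 3 or bits != 32:
        raise ValueError("expected a 32-bit IEEE float WAV")
    count = len(data) // 4
    floats = struct.unpack("<%df" % count, data[:count * 4])
    # Clip to [-1.0, 1.0], then scale to the int16 range
    ints = [int(max(-1.0, min(1.0, x)) * 32767) for x in floats]
    with wave.open(dst, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)  # 2 bytes per sample = 16-bit
        out.setframerate(sample_rate)
        out.writeframes(struct.pack("<%dh" % count, *ints))
```

Calling float32_wav_to_int16('output.wav', 'output_int16.wav') should produce a file that wave.open, and therefore sr.AudioFile, can read. The output file name here is only an example. For long recordings a NumPy-based version would be faster, since this one converts sample by sample in pure Python.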