Tags: matlab, neural-network, wave, sox

get integer representation of .SPH audio files


I am trying to train a neural network on audio files that are originally in .SPH format. The network needs integers that represent the amplitude of the sound wave, so I used sox to convert the files to .wav by calling sox infile.SPH outfile.wav remix 1-2 (remix mixes the 2 channels down to 1), and then tried [y, Fs, nbits, opts] = wavread('outfile.wav') in MATLAB to get the integer representation.

However, MATLAB threw the error: Data compression format (CCITT mu-law) is not supported. So I used sox infile.SPH -b 16 -e signed-integer -c 1 outfile.wav, which I think writes the wave file in linear format instead of mu-law. But now MATLAB throws a different error: Invalid Wave File. Reason: Cannot open file.

My audio files are 8000 Hz u-law, single or dual channel, and all 8-bit, I think (the single-channel ones are 8-bit for sure).

  1. Is there a way to get the integer representation out of the audio files using MATLAB or any other program? Either u-law or linear is fine, unless one would be better for neural net training. Preferably 8-bit, since the source files are 8-bit.

  2. I don't really understand .SPH. For the uncompressed files (ignoring the headers), do the files store amplitudes (I guess they have to somehow)? Can I extract numbers from those files directly without bothering with wave files? Are the samples stored sequentially, so that it would make sense to split the audio files?

I am new to audio processing in general, so any pointers would be appreciated!


Solution

  • You need to clearly identify the main task: feeding the neural net with vectors or matrices. So the first step is to process the audio files (outside MATLAB!) to obtain wav files. The second step is setting up and training the neural net in MATLAB.

    I would first decompress the 'sph' files and then convert them to 'wav' (for example, see the instructions here and here).

    Finally, running sox in a command/terminal window is better than running it from the MATLAB console.
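Since the question also allows programs other than MATLAB, here is a minimal Python sketch of how the integers could be pulled out once sox has written a linear PCM file (for example with the sox infile.SPH -b 16 -e signed-integer -c 1 outfile.wav call from the question). The function names read_wav_as_ints and ulaw_byte_to_int are hypothetical, chosen only for this illustration. The second function is the standard G.711 mu-law expansion, in case you prefer to keep the 8-bit u-law bytes and convert them to linear values yourself. It assumes you already have the raw sample bytes and does not parse the .SPH header.

```python
import struct
import wave


def read_wav_as_ints(path_or_file):
    """Return (sample_rate, samples) for a linear PCM WAV file.

    Python's wave module only reads uncompressed PCM, so a mu-law
    WAV file has to be converted to linear PCM first (e.g. with sox).
    For multi-channel files the samples come back interleaved.
    """
    with wave.open(path_or_file, "rb") as wav:
        width = wav.getsampwidth()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    if width == 2:
        # 16-bit WAV samples are signed and little-endian.
        samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    elif width == 1:
        # 8-bit WAV samples are unsigned; shift them to -128..127.
        samples = [b - 128 for b in raw]
    else:
        raise ValueError("unsupported sample width: %d bytes" % width)
    return rate, samples


def ulaw_byte_to_int(u):
    """Expand one 8-bit G.711 mu-law byte to a signed linear sample."""
    u = ~u & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude
```

With these, rate, samples = read_wav_as_ints('outfile.wav') gives a plain list of integers that can be split into fixed-length frames for the network, and [ulaw_byte_to_int(b) for b in raw_bytes] does the same for raw u-law data.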