c wav break gaps-in-data

C: Split a WAV file by silence gaps


I have a bunch of WAV recordings of a person reading a simple sentence ("hello world"). How can I split each WAV file into two WAV files, each containing one word ("hello" and "world"), by automatically detecting the gap between the words? Unfortunately, I was unable to find a tool that does this, so I will write C code for it. As I understand it, the gaps should show up as low sample values in the WAV data; is that correct? I already know how to split the files, so I would be glad to get an approach to the gap-detection problem. Thank you!


Solution

  • The way I approach this kind of task is to break the WAV file into blocks of, say, 0.05 seconds each, compute the RMS amplitude of each block, and compare that RMS amplitude to a threshold. If the recording is made under carefully controlled conditions and the volume of the speech is relatively well normalized, the threshold can be a static value. Another way is to set it dynamically, by checking for a block that is substantially louder than the previous block. You then treat the over-threshold block as the start of a word.

    However, in casual speech, there may not be much of a pause between words. If I say "helloworld" to you without a pause, you can understand me easily.

    RMS amplitude is defined as the square root of the average-over-time of the squares of the individual samples.
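
The static-threshold variant described above could be sketched in C roughly as follows. This is a minimal illustration, not a tested splitter: it assumes the audio has already been decoded into mono 16-bit PCM samples in memory, and the names `block_mean_square` and `find_word_starts` are hypothetical helpers, not part of any library. To avoid a square root per block, it compares the mean square against the squared threshold, which gives the same decision as comparing the RMS against the threshold when the threshold is non-negative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mean of the squared samples in one block (hypothetical helper).
 * The RMS amplitude is the square root of this value. */
double block_mean_square(const int16_t *samples, size_t n)
{
    double sum_sq = 0.0;
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        double s = (double)samples[i];
        sum_sq += s * s;
    }
    return sum_sq / (double)n;
}

/* Scan mono PCM in fixed-size blocks and store the sample offset of each
 * block that is above the threshold while the previous block was not.
 * rms_threshold is an assumed, non-negative tuning value in sample units.
 * A trailing partial block is ignored. Returns the number of offsets
 * stored (at most max_starts). */
size_t find_word_starts(const int16_t *samples, size_t total,
                        size_t block_len, double rms_threshold,
                        size_t *starts, size_t max_starts)
{
    size_t count = 0;
    int prev_loud = 0; /* treat the time before the file as silence */
    double limit = rms_threshold * rms_threshold;

    assert(block_len > 0);
    for (size_t off = 0; off + block_len <= total; off += block_len) {
        /* mean square > threshold^2  <=>  RMS > threshold */
        int loud = block_mean_square(samples + off, block_len) > limit;
        if (loud && !prev_loud && count < max_starts)
            starts[count++] = off;
        prev_loud = loud;
    }
    return count;
}
```

For 0.05-second blocks at a 44100 Hz sample rate, `block_len` would be 2205 samples. The threshold has to be tuned to the recordings; for the dynamic variant, the comparison against `limit` could be replaced by a comparison against some multiple of the previous block's mean square. If the RMS value itself is needed, take `sqrt()` (from `<math.h>`) of the value returned by `block_mean_square`.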