Convert 8 kHz mu-law to 16 kHz PCM in real time


In my POC I'm receiving a conversation stream from Twilio as 8 kHz mu-law audio, and I want to transcribe it with Amazon Transcribe, which needs the audio as 16 kHz PCM.

I found how to convert a file, but I could not get the same conversion working on a stream. The code for a file is:

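import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

File sourceFile = new File("<Source_Path>.wav");
File targetFile = new File("<Destination_Path>.wav");
AudioInputStream sourceAudioInputStream = AudioSystem.getAudioInputStream(sourceFile);

// Step 1: decode mu-law to signed PCM, keeping the source sample rate (8 kHz).
AudioInputStream targetAudioInputStream = AudioSystem.getAudioInputStream(AudioFormat.Encoding.PCM_SIGNED, sourceAudioInputStream);
System.out.println("Sample Rate1 " + targetAudioInputStream.getFormat().getFrameRate());

// Step 2: resample to 16 kHz, 16-bit, mono, 2 bytes per frame, little-endian.
// For PCM the frame rate equals the sample rate, so both are 16000.
AudioFormat targetFormat = new AudioFormat(AudioFormat.Encoding.PCM_SIGNED, 16000, 16, 1, 2, 16000, false);
AudioInputStream targetAudioInputStream1 = AudioSystem.getAudioInputStream(targetFormat, targetAudioInputStream);
System.out.println("Sample Rate " + targetAudioInputStream1.getFormat().getFrameRate());

try {
    // Write the converted stream out as a WAVE file.
    AudioSystem.write(targetAudioInputStream1, AudioFileFormat.Type.WAVE, targetFile);
} catch (IOException e) {
    e.printStackTrace();
}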

Twilio actually gives me a Base64 payload (8 kHz, mu-law), which I have to convert to 16 kHz PCM.


Solution

  • You need a G.711 decoder and an audio resampler.

    Steps to follow:

    1. Use a Base64 decoder to decode the received payload.

    2. Decode the resulting buffer with the G.711 decoder (mu-law to PCM).

    3. Pass the output of the G.711 decoder to the resampler for upsampling (8 kHz to 16 kHz).

    The resulting buffers are 16 kHz PCM.
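The three steps can be sketched in plain Java without any audio library. This is a minimal sketch, not a definitive implementation: the class name `MulawStreamConverter` and its methods are hypothetical, the output is assumed to be 16-bit signed little-endian mono, and the upsampling uses simple linear interpolation as a stand-in for a proper resampler (a filter-based resampler would give better audio quality). The converter keeps the last decoded sample between calls so that consecutive chunks join without a discontinuity.

```java
import java.util.Base64;

// Hypothetical helper: one instance per audio stream, one convert() call per payload.
class MulawStreamConverter {
    // Last 8 kHz sample of the previous chunk, used to interpolate across chunk boundaries.
    private int previousSample = 0;

    // Step 2: expand one G.711 mu-law byte to a 16-bit linear PCM sample.
    static short decodeMulaw(byte mulaw) {
        int u = ~mulaw & 0xFF;            // mu-law bytes are stored bit-inverted
        int sign = u & 0x80;
        int exponent = (u >> 4) & 0x07;
        int mantissa = u & 0x0F;
        int magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
        return (short) (sign != 0 ? -magnitude : magnitude);
    }

    // Converts one Base64 payload (8 kHz mu-law) to 16 kHz, 16-bit little-endian PCM bytes.
    byte[] convert(String base64Payload) {
        // Step 1: Base64 text to raw mu-law bytes (one byte per 8 kHz sample).
        byte[] mulaw = Base64.getDecoder().decode(base64Payload);

        // Two output samples per input sample, two bytes per output sample.
        byte[] pcm16k = new byte[mulaw.length * 4];
        int offset = 0;
        for (byte b : mulaw) {
            int current = decodeMulaw(b);
            // Step 3: insert the midpoint between the previous and current sample,
            // then the current sample itself, doubling the sample rate.
            int midpoint = (previousSample + current) / 2;
            offset = writeLittleEndian(pcm16k, offset, midpoint);
            offset = writeLittleEndian(pcm16k, offset, current);
            previousSample = current;
        }
        return pcm16k;
    }

    // Writes one 16-bit sample, low byte first, and returns the next write offset.
    private static int writeLittleEndian(byte[] buffer, int offset, int sample) {
        buffer[offset] = (byte) (sample & 0xFF);
        buffer[offset + 1] = (byte) ((sample >> 8) & 0xFF);
        return offset + 2;
    }
}
```

In use, you would create one converter per call and pass each incoming payload string to `convert`, forwarding the returned bytes to the transcription stream. Each input byte yields four output bytes, so a 160-byte payload (20 ms at 8 kHz) becomes 640 bytes of 16 kHz PCM.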