Generate .WAV from PCM (AWS Polly)


I've tried to find answers online, but couldn't find anything that helped me...

I am trying to convert a PCM stream into a WAV file using PHP (7.2) and save it on the server.

Specifically, I am generating speech via Amazon Polly with the code below:

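try {
    $result = $client->synthesizeSpeech([
        'Text' => 'Dies ist ein Test.',
        'OutputFormat' => 'pcm',
        'SampleRate' => '8000',
        'VoiceId' => 'Hans'
    ]);

    // Raw PCM bytes: signed 16-bit, little-endian, mono
    $resultData = $result->get('AudioStream')->getContents();
} catch (\Aws\Exception\AwsException $e) {
    // A try block needs a catch (or finally) to be valid PHP
    error_log($e->getMessage());
}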

I need a WAV file for use with different code later on.

Many thanks for your help!


Solution

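  • You just need to prepend a 44-byte WAV (RIFF) header to the PCM data. The layout is described at http://soundfile.sapp.org/doc/WaveFormat/. Polly's pcm output is signed 16-bit, mono, little-endian, so the header only has to declare those values plus the sample rate you requested.

    I couldn't find a PHP library for this, so I wrote a short PHP script that does it: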

    <?php

    // Raw PCM from Polly: signed 16-bit, little-endian, mono
    $pcm = file_get_contents('polly.raw');
    //$pcm = $result->get('AudioStream')->getContents();

    // Output file
    $fp = fopen('file.wav', 'wb');

    $pcm_size = strlen($pcm);
    $size = 36 + $pcm_size;  // RIFF chunk size: total file size minus the first 8 bytes
    $chunk_size = 16;        // size of the fmt sub-chunk for PCM
    $audio_format = 1;       // 1 = uncompressed linear PCM
    $channels = 1;           // mono

    /**
     * Must match the SampleRate passed to synthesizeSpeech (the question uses "8000"),
     * otherwise the audio plays back at the wrong speed.
     * From the AWS Polly documentation: valid values for pcm are "8000" and "16000"; the default is "16000".
     * https://docs.aws.amazon.com/polly/latest/dg/API_SynthesizeSpeech.html#polly-SynthesizeSpeech-request-OutputFormat
     */
    $sample_rate = 16000; // Hz
    $bits_per_sample = 16;
    $block_align = $channels * $bits_per_sample / 8;               // bytes per sample frame
    $byte_rate = $sample_rate * $channels * $bits_per_sample / 8;  // bytes per second

    /**
     * References:
     * http://soundfile.sapp.org/doc/WaveFormat/
     * https://github.com/jwhu1024/pcm-to-wav/blob/master/inc/wave.h
     * https://jun711.github.io/aws/convert-aws-polly-synthesized-speech-from-pcm-to-wav-format/
     */
    
    // RIFF chunk descriptor
    fwrite($fp, 'RIFF');
    fwrite($fp, pack('V', $size));  // 'V' = unsigned 32-bit little-endian ('I' is machine-dependent)
    fwrite($fp, 'WAVE');

    // fmt sub-chunk
    fwrite($fp, 'fmt ');
    fwrite($fp, pack('V', $chunk_size));
    fwrite($fp, pack('v', $audio_format));  // 'v' = unsigned 16-bit little-endian
    fwrite($fp, pack('v', $channels));
    fwrite($fp, pack('V', $sample_rate));
    fwrite($fp, pack('V', $byte_rate));
    fwrite($fp, pack('v', $block_align));
    fwrite($fp, pack('v', $bits_per_sample));

    // data sub-chunk
    fwrite($fp, 'data');
    fwrite($fp, pack('V', $pcm_size));
    fwrite($fp, $pcm);

    fclose($fp);

    FFmpeg can do the same conversion (for example: ffmpeg -f s16le -ar 16000 -ac 1 -i polly.raw file.wav), but this solution is written purely in PHP.

    I hope this helps!
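
The header-plus-data steps above can also be wrapped in a function that returns the WAV bytes as a string, which makes it easier to reuse with either Polly sample rate. This is only a minimal sketch and not part of the original answer: the function name pcmToWav and its parameters are hypothetical, and it assumes uncompressed 16-bit mono PCM as Polly returns it.

```php
<?php

/**
 * Wrap raw PCM bytes in a 44-byte WAV header (hypothetical helper).
 * Assumes uncompressed linear PCM; $sampleRate must match the value
 * requested from Polly (8000 or 16000 for the pcm output format).
 */
function pcmToWav(string $pcm, int $sampleRate = 16000, int $channels = 1, int $bitsPerSample = 16): string
{
    $blockAlign = intdiv($channels * $bitsPerSample, 8); // bytes per sample frame
    $byteRate   = $sampleRate * $blockAlign;             // bytes per second
    $dataSize   = strlen($pcm);

    // 'V' = 32-bit and 'v' = 16-bit unsigned little-endian; 'a4' = 4 raw bytes
    $header = pack(
        'a4Va4a4VvvVVvva4V',
        'RIFF', 36 + $dataSize, 'WAVE',
        'fmt ', 16, 1, $channels, $sampleRate, $byteRate, $blockAlign, $bitsPerSample,
        'data', $dataSize
    );

    return $header . $pcm;
}

// Example: one second of silence at 8000 Hz (16-bit mono = 16000 bytes of PCM)
$wav = pcmToWav(str_repeat("\0", 16000), 8000);
```

With the request from the question, the call would look like file_put_contents('file.wav', pcmToWav($resultData, 8000)). Building the whole header in one pack() call keeps the field order visible in a single place, but writing the fields one by one as in the answer works just as well.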