I am uploading an audio file recorded in Chrome to an S3 bucket, and I am then attempting to play it with the TwiML <Play> verb.
The error code I get is 12300 Failed to retrieve Play url, e.g.
https://xxxx.s3.amazonaws.com/recordings/dev/prompt-26-87.mp3
With content type video/webm
I suspect the problem is the content type. video/webm is one of the Twilio accepted MIME types, but I suspect it is not accepted by the <Play> verb.
I can play the recording in the browser just fine, so there is nothing wrong with the URL or file upload.
I am uploading the file to S3 using ajax...
this.uploadAudio = function(fname) {
if(self.audioBlobs[fname]) {
let url = $('#upload-recording-url').val();
let fd = new FormData();
fd.append('fname', fname);
fd.append('data', self.audioBlobs[fname]);
$.ajax({
type: 'POST',
url: url,
data: fd,
processData: false,
contentType: 'audio/mpeg'
}).done(function(data) {
});
}
}
In the PHP server I am using the AWS Flysystem adaptor to write the uploaded audio file to AWS
public function uploadRecording(Request $request,
LoggerInterface $callcenterLogger
): Response
{
try {
$fname = $request->request->get('fname');
$data = $request->files->get('data');
$tempFile = $_FILES['data']['tmp_name'];
$blob = file_get_contents($tempFile);
$repo = $this->getFileRepositories()->getFileRepo('recordings');
$root = $repo->getAbsoluteWebRoot();
$env = $_ENV['APP_ENV'];
$path = $env.'/'.$fname;
$repoName = $repo->write($path, $blob);
return new JsonResponse(['success' => true, 'repo_name' => $repoName]);
} catch(\Exception $ex) {
return new JsonResponse(['success' => false, 'message' => $ex->getMessage()]);
}
}
However when I query the file with Postman, the content type always comes back as video/webm.
The recording is made in the browser with MediaRecorder
this.recordMicrophone = function(stream) {
self.mediaRecorder = new MediaRecorder(stream);
self.mediaRecorder.onstop = function(e) {
self.$audio.attr('controls',true);
let audioBlob = new Blob(self.chunks, { 'type' : 'audio/mpeg; codecs=mp3' });
self.chunks = [];
let audioURL = window.URL.createObjectURL(audioBlob);
self.$audio.attr('src', audioURL);
let recordingName = self.getRecordingName(self.$audio)
self.audioBlobs[recordingName] = audioBlob;
}
self.mediaRecorder.ondataavailable = function(e) {
self.chunks.push(e.data);
}
}
The TwiML file looks something like this...
<Response>
<Play>https://xxxxx.s3.amazonaws.com/recordings/dev/prompt-26-87.mp3</Play>
<Gather action="/anon/voice.xml/menu-option" input="dtmf speech" method="GET" numDigits="1">
</Gather>
</Response>
I have tried a few other formats besides audio/mpeg: audio/ogg and audio/mp4 and codecs: opus and mp4. All with the same result: 12300 error in Twilio with content type of video/webm.
So I am wondering what is going on.
There are a few questions along this same line on SO. (e.g. Play audio file from S3, Audio is not playing for AWS s3 bucket video played through PHP Streaming file) but none of them seem to shed light on this problem.
UPDATE:
After a lot of fooling around I have discovered that the TwiML PLAY verb only plays .WAV and .MP3 files. The documentation is completely wrong on the supported formats, at least for the PLAY verb.
I tried having Chrome record an MP3 file and this didn't work, but I have discovered the MP3 file that is created in Chrome through MediaRecorder is not a true MP3 file but an MPEG container without the video track. The PLAY verb does not play this either, only true MP3 files.
So now I am working on getting Chrome to save a .WAV or a true .MP3 file. There is a bunch of example code floating around and I'll post an answer when I get something working.
Right.
So the problem turned out to have nothing to do with AWS S3 buckets. The fundamental problem is that Twilio PLAY is incapable of playing any recording made by Chrome.
I was fooled on both sides. First, I thought that Twilio could play advanced formats like audio/ogg, because audio/ogg and many, many other formats appear on the list of accepted media types: https://www.twilio.com/docs/sms/accepted-mime-types . Ha ha, these are the formats that are supported for Programmable Messaging. The PLAY verb belongs to Programmable Voice, and the list of formats documented there is much shorter and older. In fact the PLAY verb does not support any 21st century format!
I was fooled on the Chrome side because I thought I was writing audio files in different formats. I set a format, Chrome creates a file. The files are varying lengths, so I think I am creating audio files in different formats. Ha ha, the file lengths are different because the recordings I am making in Chrome are different. You can supply any format you want to MediaRecorder and it says, "fine, let me just create a video/webm file for you."
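The type passed to the Blob constructor is only a label; it does not convert the recorded data. One way to avoid being fooled is to ask the browser what it can actually record and to label the Blob with the recorder's own mimeType. A minimal sketch, where pickRecordingType and the candidate list are hypothetical names of mine, and the support check is passed in so the function can also run outside a browser:

```javascript
// Hypothetical helper: return the first candidate type that the supplied
// check accepts, or '' to let the browser choose its default container.
function pickRecordingType(candidates, isTypeSupported) {
  for (const type of candidates) {
    if (isTypeSupported(type)) {
      return type;
    }
  }
  return '';
}

// In the browser (assumed usage, not run here):
//   const type = pickRecordingType(
//     ['audio/mpeg', 'audio/wav', 'audio/webm;codecs=opus'],
//     (t) => MediaRecorder.isTypeSupported(t));
//   const recorder = new MediaRecorder(stream, type ? { mimeType: type } : {});
//   ...
//   // Label the Blob with what was really recorded.
//   const blob = new Blob(chunks, { type: recorder.mimeType });
```

If the chosen type comes back as a webm variant, that is a strong hint that a conversion step is still needed before the PLAY verb will accept the file.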
THE RIGHT TOOLS FOR THE JOB
I finally made headway on this frustrating problem when I installed the FFmpeg tool set (https://ffmpeg.org). Aha! I see now with ffprobe that the files I thought were MP3, WAV and so on are all video/webm. Pfffffft.
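If installing FFmpeg just to inspect a file is not an option, the first few bytes already give the container away. The sketch below is a rough heuristic and not a replacement for ffprobe; sniffAudioContainer is a hypothetical name, and it only knows the handful of signatures relevant here:

```javascript
// Hypothetical helper: guess the container from the first bytes of a file.
// `bytes` is a Uint8Array (or Node Buffer) with the first 12 bytes or more.
function sniffAudioContainer(bytes) {
  const ascii = (start, end) =>
    String.fromCharCode(...bytes.slice(start, end));

  // WebM/Matroska files start with the EBML magic number 1A 45 DF A3.
  if (bytes[0] === 0x1a && bytes[1] === 0x45 &&
      bytes[2] === 0xdf && bytes[3] === 0xa3) {
    return 'webm/matroska';
  }
  // WAV: "RIFF", a 4-byte size, then "WAVE".
  if (ascii(0, 4) === 'RIFF' && ascii(8, 12) === 'WAVE') {
    return 'wav';
  }
  // Ogg pages start with the capture pattern "OggS".
  if (ascii(0, 4) === 'OggS') {
    return 'ogg';
  }
  // MP3: an ID3v2 tag, or an MPEG audio frame sync (11 set bits).
  // Heuristic only: other MPEG audio layers and ADTS AAC share this sync.
  if (ascii(0, 3) === 'ID3' ||
      (bytes[0] === 0xff && (bytes[1] & 0xe0) === 0xe0)) {
    return 'mp3';
  }
  return 'unknown';
}
```

In Node, something like sniffAudioContainer(require('fs').readFileSync(path)) would be enough to see that a file named .mp3 is really webm.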
So now I am faced with the problem of converting Chrome's output to something the PLAY verb can use.
BEGIN RANT
This problem I have is clear and convincing evidence that the Twilio platform is simply a huge bag of incompatible technologies. Why does Programmable Messaging support modern audio formats while Programmable Voice does not? The obvious explanation is that Programmable Messaging is newer than Programmable Voice and there are no common libraries on which they both draw. This has the whiff of, nay the stench of, YAGNI. You build a one-story building on a foundation that supports one story and then add 9 more stories on top of it. You design a truck without windshield wipers because it wasn't raining when you designed it. It turns out that Twilio is your duct tape and cable tie solution to telephony.
END RANT
So I had a couple of ways forward. I could gather raw audio data and write my own WAV file in the browser. That seemed wrong.
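For reference, the WAV container itself is simple: a 44-byte header followed by raw PCM samples. A rough sketch of that first option, assuming mono Float32 samples in the range -1..1 have already been captured somehow (for example through the Web Audio API, which is not shown here); encodeWav is a hypothetical name:

```javascript
// Hypothetical helper: wrap mono Float32 samples (range -1..1) in a
// 16-bit PCM WAV container and return the bytes as an ArrayBuffer.
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataLength = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataLength);
  const view = new DataView(buffer);
  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, 'RIFF');
  view.setUint32(4, 36 + dataLength, true);   // file size minus 8 bytes
  writeAscii(8, 'WAVE');
  writeAscii(12, 'fmt ');
  view.setUint32(16, 16, true);               // size of the fmt chunk
  view.setUint16(20, 1, true);                // format 1 = linear PCM
  view.setUint16(22, 1, true);                // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true);   // block align
  view.setUint16(34, 16, true);               // bits per sample
  writeAscii(36, 'data');
  view.setUint32(40, dataLength, true);

  // Clamp each sample and scale it to a signed 16-bit integer.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}
```

The encoding is the easy part; collecting the raw samples in the browser is where the real work would be, which is part of why a server-side conversion can be the less painful route.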
What I ended up doing was larding up my server with the FFmpeg tools, coupled to PHP through PHP-FFMpeg (https://github.com/PHP-FFMpeg/PHP-FFMpeg). So now my upload recording looks like this...
public function uploadRecording(Request $request,
LoggerInterface $callcenterLogger
): Response
{
try {
$fname = $request->request->get('fname');
$data = $request->files->get('data');
$tempFile = $_FILES['data']['tmp_name'];
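// tempnam() creates an empty file; keep its name so it can be removed later.
$tempBase = tempnam(sys_get_temp_dir(), 'audio');
// The .mp3 extension tells FFmpeg which output container to write.
$convertedFilePath = $tempBase.'.mp3';
$this->convertAudio($tempFile, $convertedFilePath);
$blob = file_get_contents($convertedFilePath);
$repo = $this->getFileRepositories()->getFileRepo('recordings');
$root = $repo->getAbsoluteWebRoot();
$env = $_ENV['APP_ENV'];
$path = $env.'/'.$fname;
$repoName = $repo->write($path, $blob);
// Remove both the converted file and the placeholder made by tempnam().
unlink($convertedFilePath);
unlink($tempBase);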
return new JsonResponse(['success' => true, 'repo_name' => $repoName]);
} catch(\Exception $ex) {
$callcenterLogger->info('exception in uploadRecording, ex = '.$ex->__toString());
return new JsonResponse(['success' => false, 'message' => $ex->getMessage()]);
}
}
Using the new convert audio function...
use FFMpeg\FFMpeg;
use FFMpeg\Format\Audio\Mp3;
public function convertAudio(string $inputPath, string $outputFile)
{
$ffmpeg = null;
if(php_uname('s') == 'Linux') {
$ffmpeg = FFMpeg::create([
'ffmpeg.binaries' => 'bin/ffmpeg',
'ffprobe.binaries' => 'bin/ffprobe'
]);
} else {
$ffmpeg = FFMpeg::create();
}
$audio = $ffmpeg->open($inputPath);
$format = new Mp3();
$audio->save($format, $outputFile);
}