
SFSpeechRecognizer (Siri Transcription) Timeout Error on iOS App


In my iOS app, I am trying to transcribe prerecorded audio using iOS 10's latest feature, the Speech API.

Multiple sources, including the documentation, state that the audio duration limit for the Speech API (more specifically, SFSpeechRecognizer) is 1 minute.

In my code, I have found that any audio file about 15 seconds or longer produces the following error.

Error Domain=kAFAssistantErrorDomain Code=203 "SessionId=com.siri.cortex.ace.speech.session.event.SpeechSessionId@50a8e246, Message=Timeout waiting for command after 30000 ms" UserInfo={NSLocalizedDescription=SessionId=com.siri.cortex.ace.speech.session.event.SpeechSessionId@50a8e246, Message=Timeout waiting for command after 30000 ms, NSUnderlyingError=0x170248c40 {Error Domain=SiriSpeechErrorDomain Code=100 "(null)"}}

I have searched extensively and have not been able to find a solution. Other people have reported the same problem, and some suspect that it is a problem with Nuance.

It is also worth noting that I do get partial results from the transcription process.

Here's the code from my iOS app:

    // Create a speech recognizer request object.
    let srRequest = SFSpeechURLRecognitionRequest(url: location)
    srRequest.shouldReportPartialResults = false

    sr?.recognitionTask(with: srRequest) { (result, error) in
        if let error = error {
            // Something wrong happened
            print(error.localizedDescription)
        } else {
            if let result = result {
                print(4)
                print(result.bestTranscription.formattedString)
                if result.isFinal {
                    print(5)
                    transcript = result.bestTranscription.formattedString
                    print(result.bestTranscription.formattedString)

                    // Store the transcript into the database.
                    print("\nSiri-Transcript: " + transcript!)

                    // Store the audio transcript into Firebase Realtime Database
                    self.firebaseRef = FIRDatabase.database().reference()

                    let ud = UserDefaults.standard
                    if let uid = ud.string(forKey: "uid") {
                        print("Storing the transcript into the database.")
                        let path = "users" + "/" + uid + "/" + "siri_transcripts" + "/" + date_recorded + "/" + filename.components(separatedBy: ".")[0]
                        print("transcript database path: \(path)")
                        self.firebaseRef.child(path).setValue(transcript)
                    }
                }
            }
        }
    }

Thank you for your help.


Solution

  • I haven't been able to confirm this, aside from seeing someone else run into the same problem, but I believe it is an undocumented limit on prerecorded audio.
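
If the limit really is somewhere around the 15 seconds observed in the question, one possible workaround is to cut the recording into shorter pieces and transcribe each piece separately. This is an untested assumption, not something the answer confirms. The sketch below only computes the segment boundaries; the `Segment` type, the `segments(totalDuration:maxSegmentLength:)` function, and the 15-second default are hypothetical names and values chosen for illustration.

```swift
import Foundation

/// A time range, in seconds, within a recording (hypothetical helper type).
struct Segment: Equatable {
    let start: Double
    let end: Double
}

/// Splits a recording of `totalDuration` seconds into consecutive segments,
/// each at most `maxSegmentLength` seconds long.
/// The 15-second default is only the threshold reported in the question,
/// not a documented limit.
func segments(totalDuration: Double, maxSegmentLength: Double = 15) -> [Segment] {
    // Nothing to split for empty recordings or a nonsensical segment length.
    guard totalDuration > 0, maxSegmentLength > 0 else { return [] }

    var result: [Segment] = []
    var start = 0.0
    while start < totalDuration {
        // The last segment is clamped to the end of the recording.
        let end = min(start + maxSegmentLength, totalDuration)
        result.append(Segment(start: start, end: end))
        start = end
    }
    return result
}
```

Each range could then be exported to its own file (for example by setting `timeRange` on an `AVAssetExportSession`) and passed to a separate `SFSpeechURLRecognitionRequest`. Cutting at fixed offsets may split a word in two, so the joined transcript can be less accurate than a single pass would be.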