I am trying to implement speech recognition using the Azure Speech SDK in iOS project using Swift and I ran into the problem that the speech recognition completion function (stopContinuousRecognition()
) blocks the app UI for a few seconds, but there is no memory or processor load or leak. I tried to move this function to DispatchQueue.main.async {}
, but it gave no results. Maybe someone faced such a problem? Is it necessary to put this in a separate thread and why does the function take so long to finish?
Edit: It is very hard to provide working example, but basically I am calling this function on button press:
private func startListenAzureRecognition(lang:String) {
let audioFormat = SPXAudioStreamFormat.init(usingPCMWithSampleRate: 8000, bitsPerSample: 16, channels: 1)
azurePushAudioStream = SPXPushAudioInputStream(audioFormat: audioFormat!)
let audioConfig = SPXAudioConfiguration(streamInput: azurePushAudioStream!)!
var speechConfig: SPXSpeechConfiguration?
do {
let sub = "enter your code here"
let region = "enter you region here"
try speechConfig = SPXSpeechConfiguration(subscription: sub, region: region)
speechConfig!.enableDictation();
speechConfig?.speechRecognitionLanguage = lang
} catch {
print("error \(error) happened")
speechConfig = nil
}
self.azureRecognition = try! SPXSpeechRecognizer(speechConfiguration: speechConfig!, audioConfiguration: audioConfig)
self.azureRecognition!.addRecognizingEventHandler() {reco, evt in
if (evt.result.text != nil && evt.result.text != "") {
print(evt.result.text ?? "no result")
}
}
self.azureRecognition!.addRecognizedEventHandler() {reco, evt in
if (evt.result.text != nil && evt.result.text != "") {
print(evt.result.text ?? "no result")
}
}
do {
try! self.azureRecognition?.startContinuousRecognition()
} catch {
print("error \(error) happened")
}
}
And when I press the button again to stop recognition, I am calling this function:
private func stopListenAzureRecognition(){
DispatchQueue.main.async {
print("start")
// app blocks here
try! self.azureRecognition?.stopContinuousRecognition()
self.azurePushAudioStream!.close()
self.azureRecognition = nil
self.azurePushAudioStream = nil
print("stop")
}
}
Also I am using raw audio data from mic (recognizeOnce
works perfectly for first phrase, so everything is fine with audio data)
Try closing the stream first and then stopping the continuous recognition:
azurePushAudioStream!.close()
try! azureRecognition?.stopContinuousRecognition()
azureRecognition = nil
azurePushAudioStream = nil
You don't even need to do it asynchronously.
At least this worked for me.