I am trying to incorporate speech recognition into an app I am building. I want to play a sound when the microphone button is pressed and then start recording and recognizing the audio. The problem is that when I press the button, no sound plays. Also, when I run the app on my physical iPhone, the volume slider in Control Center disappears. Can anybody help?
Here is my code:
import UIKit
import Speech
import AVFoundation

class VoiceViewController: UIViewController, SFSpeechRecognizerDelegate, UITextViewDelegate {
    private let speechRecognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))!
    private var speechRecognitionRequest: SFSpeechAudioBufferRecognitionRequest?
    private var speechRecognitionTask: SFSpeechRecognitionTask?
    private let audioEngine = AVAudioEngine()
    var audioPlayer: AVAudioPlayer = AVAudioPlayer()
    var url: URL?
    var recording: Bool = false
    let myTextView = UITextView()

    func startSession() throws {
        if let recognitionTask = speechRecognitionTask {
            recognitionTask.cancel()
            self.speechRecognitionTask = nil
        }

        let audioSession = AVAudioSession.sharedInstance()
        try audioSession.setCategory(AVAudioSessionCategoryRecord)

        speechRecognitionRequest = SFSpeechAudioBufferRecognitionRequest()

        guard let inputNode = audioEngine.inputNode else { fatalError("Audio engine has no input node") }
        speechRecognitionRequest?.shouldReportPartialResults = true

        speechRecognitionTask = speechRecognizer.recognitionTask(with: speechRecognitionRequest!) { result, error in
            var finished = false
            if let result = result {
                print(result.bestTranscription.formattedString)
                finished = result.isFinal
            }
            if error != nil || finished {
                self.audioEngine.stop()
                inputNode.removeTap(onBus: 0)
                self.speechRecognitionRequest = nil
                self.speechRecognitionTask = nil
            }
        }

        let recordingFormat = inputNode.outputFormat(forBus: 0)
        inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
            self.speechRecognitionRequest?.append(buffer)
        }

        audioEngine.prepare()
        try audioEngine.start()
    }

    func stopTranscribing() {
        if audioEngine.isRunning {
            audioEngine.stop()
            speechRecognitionRequest?.endAudio()
        }
    }

    @objc func btn_pressed() {
        print("pressed")
        if recording {
            url = URL(fileURLWithPath: Bundle.main.path(forResource: "tweet", ofType: "mp3")!)
        } else {
            url = URL(fileURLWithPath: Bundle.main.path(forResource: "gesture", ofType: "mp3")!)
        }
        do {
            audioPlayer = try AVAudioPlayer(contentsOf: url!)
        } catch let err {
            print(err)
        }
        audioPlayer.play()
        recording = !recording
        if recording {
            try! startSession()
        } else {
            stopTranscribing()
        }
    }

    override func viewDidLoad() {
        super.viewDidLoad()
        let button = UIButton()
        button.setTitle("push me", for: UIControlState())
        button.frame = CGRect(x: 10, y: 30, width: 80, height: 30)
        button.addTarget(self, action: #selector(btn_pressed), for: .touchUpInside)
        self.view.addSubview(button)
        myTextView.frame = CGRect(x: 60, y: 100, width: 300, height: 200)
        self.view.addSubview(myTextView)
    }

    override func didReceiveMemoryWarning() {
        super.didReceiveMemoryWarning()
        // Dispose of any resources that can be recreated.
    }
}
The docs say:
AVAudioSessionCategoryRecord The category for recording audio; this category silences playback audio.
That is the only audio session category you ever set, and you set it just as you start trying to play your AVAudioPlayer, so naturally you cannot hear anything. You need to be more nimble and deliberate with your audio session. If you want to play a sound and then start recording, use the Playback category while playing the sound, and don't start recording until you are told (through the AVAudioPlayerDelegate) that the sound has finished playing.
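One way to restructure the button handler along those lines (a sketch, not a tested drop-in; it reuses the question's property and method names and assumes the class also adopts AVAudioPlayerDelegate):

```swift
// Play the cue sound under the Playback category, and only switch to the
// Record category (via startSession) after the player reports that the
// sound has finished. Assumes the rest of the question's class as-is.
@objc func btn_pressed() {
    recording = !recording
    let name = recording ? "gesture" : "tweet"   // the question's two cue sounds
    let url = Bundle.main.url(forResource: name, withExtension: "mp3")!
    do {
        // Playback category so the cue is audible; Record would silence it.
        try AVAudioSession.sharedInstance().setCategory(AVAudioSessionCategoryPlayback)
        audioPlayer = try AVAudioPlayer(contentsOf: url)
        audioPlayer.delegate = self   // requires VoiceViewController: AVAudioPlayerDelegate
        audioPlayer.play()
    } catch {
        print(error)
    }
    if !recording {
        stopTranscribing()
    }
}

// AVAudioPlayerDelegate callback: the cue is done, so it is now safe to
// switch the session to Record and start transcribing.
func audioPlayerDidFinishPlaying(_ player: AVAudioPlayer, successfully flag: Bool) {
    if recording {
        try? startSession()   // startSession() sets AVAudioSessionCategoryRecord itself
    }
}
```

The design point is the ordering: the Record category is never active while the cue is playing, because the switch happens only inside the delegate callback.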