ios swift pdf pdfkit avspeechsynthesizer

How can I get a selected word in PDF so that I can have the word pronounced? [Swift, PDFKit]

I am creating a PDF viewer for iOS in Swift. In the app, I want to add a feature that lets learners tap any word in the PDF to hear its pronunciation.

I have managed to display a PDF file and to have a piece of text pronounced. However, I am struggling to connect the two.

I would appreciate it if you could show me how to modify the following code so that when a user taps a word, the word is pronounced using text-to-speech.

Thank you very much!

ViewController.swift

import UIKit
import PDFKit
import AVFoundation

class ViewController: UIViewController {
    @IBOutlet weak var pdfView: PDFView!

    override func viewDidLoad() {
        super.viewDidLoad()
        if let url = Bundle.main.url(forResource: "pdf", withExtension: "pdf") {
            if let pdfDocument = PDFDocument(url: url) {
                pdfView.autoScales = true
                pdfView.document = pdfDocument
            }
        }

        // Placeholder: this should be the word the user tapped
        let wordTouched = "test"

        // Play the word
        let utterance = AVSpeechUtterance(string: wordTouched)
        utterance.voice = AVSpeechSynthesisVoice(language: "en-US")

        let synth = AVSpeechSynthesizer()
        synth.speak(utterance)
    }
}

I am aware that there are several discussions on how to get all of the text in a PDF (for example, "How can I get all text from a PDF in Swift?"). What I need instead is the single word that the user tapped, so that I can pass it to AVSpeechSynthesizer.

Solution

Add a UITapGestureRecognizer to pdfView, for example in viewDidLoad:

let tapgesture = UITapGestureRecognizer(target: self, action: #selector(tapGesture(_:)))
pdfView.addGestureRecognizer(tapgesture)

Handle the tap gesture:

// Keep the synthesizer as a property of the view controller; a synthesizer
// created as a local variable can be deallocated before it finishes speaking.
let synth = AVSpeechSynthesizer()

@objc func tapGesture(_ gestureRecognizer: UITapGestureRecognizer) {
    let point = gestureRecognizer.location(in: pdfView)

    if let page = pdfView.page(for: point, nearest: false) {
        // Convert the point from the pdfView coordinate system to the page coordinate system
        let convertedPoint = pdfView.convert(point, to: page)

        // Ensure that there is no link/URL annotation at this point
        if page.annotation(at: convertedPoint) == nil {
            // Get the word at this point
            if let selection = page.selectionForWord(at: convertedPoint),
               let wordTouched = selection.string {
                // Pronounce the word
                let utterance = AVSpeechUtterance(string: wordTouched)
                utterance.voice = AVSpeechSynthesisVoice(language: "en-US")
                synth.speak(utterance)

                // Optionally highlight the selected word for one second
                pdfView.currentSelection = selection
                DispatchQueue.main.asyncAfter(deadline: .now() + 1) { [weak self] in
                    self?.pdfView.clearSelection()
                }
            }
        }
    }
}
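
If the string returned by selection.string ever comes back with surrounding whitespace or punctuation (an assumption on my part, not something verified against PDFKit here), it may be worth cleaning it up before handing it to the synthesizer. The sketch below uses only Foundation; cleanedWord is a hypothetical helper name, not a PDFKit or AVFoundation API.

```swift
import Foundation

/// Hypothetical helper (not part of PDFKit or AVFoundation).
/// Strips leading and trailing whitespace and punctuation from a selected
/// string and returns nil when nothing speakable is left.
func cleanedWord(_ raw: String) -> String? {
    // Characters to drop from both ends; punctuation inside the word
    // (such as the apostrophe in "don't") is left untouched.
    let unwanted = CharacterSet.whitespacesAndNewlines.union(.punctuationCharacters)
    let trimmed = raw.trimmingCharacters(in: unwanted)
    return trimmed.isEmpty ? nil : trimmed
}
```

In the tap handler, this could be applied right after unwrapping the selection, for example with `if let raw = selection.string, let wordTouched = cleanedWord(raw)`, so that an empty or punctuation-only selection is simply not spoken.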