Tags: html, ios, swift, parsing, html-parsing

Parse text between <div> tags with HTMLKit in Swift


I am trying to parse an HTML document and extract URLs and text from it. For this task I am using the HTMLKit library. To get the URLs I use the following code:

func parseHTML() {
    browser.evaluateJavaScript("document.body.innerHTML") { (result, error) in
        guard let html = result as? String, error == nil else {
            print("Failed to get html string")
            return
        }
        
        let document = HTMLDocument(string: html)
        print("Create html doc")
        
        let urls: [String] = document.querySelectorAll("div").compactMap { element in
            // Keep only the elements that carry an href attribute.
            guard let href = element.attributes["href"] as? String else {
                return nil
            }
            return href
        }
        
        print("Found \(urls.count) urls \n")

    }
}

This all works well, but I don't know how to extract the text between the <div> tags.

HTML code:

<div class="V7Sr0 p5AXld PpBGzd YcUVQe">What are the alternatives now that the Google web search API has been ...</div>

How should I modify the code to get the text "What are the alternatives now that the Google web search API has been ..."?


Solution

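  • HTMLKit has a property that returns the text inside an element: HTMLElement.textContent. For the code above, document.querySelectorAll("div").map { $0.textContent } collects the text of every <div>.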

    Alternatively, you can use a regular expression without HTMLKit, for example: (?<=>)(.*)(?=<)
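
To illustrate the regex route, here is a minimal sketch that applies the pattern above with Foundation's NSRegularExpression. The helper name textBetweenTags(in:) is made up for this example and is not part of HTMLKit; the input is the <div> from the question.

```swift
import Foundation

/// Hypothetical helper (not part of HTMLKit): returns whatever the pattern
/// captures between a ">" and a following "<" in `html`.
func textBetweenTags(in html: String) -> [String] {
    // Same pattern as in the answer: lookbehind for ">", greedy capture,
    // lookahead for "<".
    let pattern = "(?<=>)(.*)(?=<)"
    guard let regex = try? NSRegularExpression(pattern: pattern) else {
        return []
    }
    let fullRange = NSRange(html.startIndex..., in: html)
    return regex.matches(in: html, range: fullRange).compactMap { match -> String? in
        // Convert the NSRange of capture group 1 back to a Swift string range.
        guard let range = Range(match.range(at: 1), in: html) else { return nil }
        return String(html[range])
    }
}

let html = "<div class=\"V7Sr0 p5AXld PpBGzd YcUVQe\">What are the alternatives now that the Google web search API has been ...</div>"
let texts = textBetweenTags(in: html)
print(texts)
```

Because the capture is greedy, it runs from the first ">" to the last "<" on a line, so with nested or adjacent tags the result can include markup; "." also does not cross line breaks by default. For anything beyond a single flat element, textContent is the safer choice.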