Parse JSON from head with jsoup


I am trying to get the JSON from a URL such as "https://www.imdb.com/title/tt9598834/".

This is the code:

    suspend fun getRealRating(imdbCode: String): String = suspendCoroutine { cont ->

        val url = "https://www.imdb.com/title/tt9598834/"
        var document: Document? = null

        try {
            document = Jsoup.connect(url).get()
        } catch (e: IOException) {
            e.printStackTrace()
        }
        cont.resume("")
    }

I can see that

    document.head().allElements[0]

contains a

    <script type="application/ld+json">

element holding the movie data as JSON. How can I get this JSON as a string?


Solution

  • You can do the following (it's Java not Kotlin, but it shouldn't be a big difference):

    Document doc = Jsoup.connect(url).get();
    // In this case you want the first script tag
    Element e = doc.select("script").first();
    String s = e.html();
    
    System.out.println(s);
    

    Part of the output I got:

    {"@context":"https://schema.org","@type":"Movie","url":"/title/tt9598834/","name":"The Xrossing","image":

    If there is more than one such element, you can select them all by type:

    Elements el = doc.select("script[type=application/ld+json]");
    

    And then iterate over the result:

    for (Element e : el) {
        System.out.println(e.html());
    }
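
Putting the pieces above together, here is a minimal, self-contained sketch. It parses an inline HTML string instead of fetching the IMDb page, so the sample markup and the helper name `extractLdJson` are assumptions made for illustration and are not part of jsoup or of the original answer. It selects by `type` rather than taking the first `script` tag, since the JSON-LD block is not guaranteed to come first on every page, and it reads the script body with `Element.data()`.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

import java.util.ArrayList;
import java.util.List;

public class LdJsonExample {

    // Hypothetical helper: collects the raw body of every JSON-LD script element.
    public static List<String> extractLdJson(Document doc) {
        List<String> blocks = new ArrayList<>();
        for (Element script : doc.select("script[type=application/ld+json]")) {
            // data() returns the script contents as text
            blocks.add(script.data());
        }
        return blocks;
    }

    public static void main(String[] args) {
        // Made-up sample markup standing in for a downloaded page.
        String html = "<html><head>"
                + "<script>var x = 1;</script>"
                + "<script type=\"application/ld+json\">"
                + "{\"@type\":\"Movie\",\"name\":\"Example\"}"
                + "</script>"
                + "</head><body></body></html>";

        Document doc = Jsoup.parse(html);
        for (String json : extractLdJson(doc)) {
            System.out.println(json);
        }
    }
}
```

For a live page, `Jsoup.parse(html)` would be replaced by `Jsoup.connect(url).get()` as in the answer. The returned string can then be passed to whatever JSON parser the project already uses.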