java android jsoup

How to get the actual source code with jsoup without changing case or line breaks?


I am using jsoup (version 1.13.1) to get the source code of a page. When I fetch it with the code below, tag and attribute names are converted to lowercase.

Document doc = Jsoup.connect("https://example.com").get();
webview.loadData(doc.outerHtml(), "text/html", "UTF-8"); // loadData expects an HTML string, not a Document

I saw several answers that recommend an XML parser, but I don't know how to use an XML parser to parse HTML from a URL. There is also a base URL parameter that I don't understand. I am working on an Android app project, so any answer will be helpful. Thanks in advance.


Solution

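  • By default, jsoup's HTML parser normalizes tag and attribute names to lowercase, and the output is pretty-printed, which changes line breaks. It's easy to use a different parser: either the XML parser, which preserves case and disables pretty-printing (so line breaks are kept), or the HTML parser configured to behave the same way. Just pass it to the Connection#parser() method: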

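    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.parser.ParseSettings;
    import org.jsoup.parser.Parser;

    // Option 1: XML parser (preserves case; pretty-printing is off)
    Document xmlDocument = Jsoup.connect("https://example.com")
        .parser(Parser.xmlParser())
        .get();

    // Option 2: HTML parser configured to preserve case
    Document htmlDocument = Jsoup.connect("https://example.com")
        .parser(Parser.htmlParser().settings(ParseSettings.preserveCase))
        .get();
    htmlDocument.outputSettings().prettyPrint(false); // keep the original line breaks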
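
On the base URL mentioned in the question: it is only the address against which relative links (such as `images/Logo.PNG`) are resolved into absolute ones. With `Jsoup.connect(url).get()`, jsoup uses the fetched URL as the base automatically, so nothing extra needs to be passed; it only has to be given explicitly when parsing a string with `Jsoup.parse(html, baseUri)`. The sketch below uses plain `java.net.URI` rather than jsoup to show what that resolution amounts to. The class name `BaseUrlDemo`, the helper `resolve`, and the example URLs are made up for illustration.

```java
import java.net.URI;

// Illustration only: shows how a base URL turns relative links into absolute ones.
// BaseUrlDemo and resolve() are hypothetical names, not part of jsoup or Android.
class BaseUrlDemo {

    // Resolves a link found in a page against the URL the page was loaded from.
    static String resolve(String baseUrl, String link) {
        return URI.create(baseUrl).resolve(link).toString();
    }

    public static void main(String[] args) {
        String base = "https://example.com/docs/index.html";

        // A relative path is resolved against the directory of the base URL.
        System.out.println(resolve(base, "images/Logo.PNG"));
        // prints https://example.com/docs/images/Logo.PNG

        // A path starting with "/" is resolved against the host root.
        System.out.println(resolve(base, "/about"));
        // prints https://example.com/about

        // An already absolute link is returned unchanged.
        System.out.println(resolve(base, "https://other.example/x"));
        // prints https://other.example/x
    }
}
```

If the base URL is left empty when parsing a string, relative links simply stay relative; the case and line breaks of the source are not affected by it either way.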