I have the sample code below.
String sample = "<html>\n"
        + "<head>\n"
        + "</head>\n"
        + "<body>\n"
        + "This is a sample on parsing HTML body using jsoup\n"
        + "This is a sample on parsing HTML body using jsoup\n"
        + "</body>\n"
        + "</html>";
Document doc = Jsoup.parse(sample);
String output = doc.body().text();
I get the output as
This is a sample on parsing HTML body using jsoup This is a sample on parsing HTML body using jsoup
But I want the output as
This is a sample on parsing HTML body using jsoup
This is a sample on parsing HTML body using jsoup
How do I parse it so that I get this output? Or is there another way to do this in Java?
You can disable pretty printing on the document to get the output you want. You also have to change .text() to .html(), because .text() normalizes whitespace and collapses the line breaks into single spaces.
Document doc = Jsoup.parse(sample);
doc.outputSettings(new Document.OutputSettings().prettyPrint(false));
String output = doc.body().html();
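For completeness, here is the same idea as a small self-contained class. Treat it as a sketch rather than the definitive way to do it: it assumes jsoup is on the classpath, and the class name `PreserveLineBreaks` and the helper `bodyWithLineBreaks` are names I made up for illustration. The `trim()` call is my own addition, on the assumption that you do not want the newlines surrounding the body content.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class PreserveLineBreaks {

    // Hypothetical helper (not part of jsoup): returns the inner HTML of
    // <body> with the source line breaks left in place.
    public static String bodyWithLineBreaks(String html) {
        Document doc = Jsoup.parse(html);
        // Turn off pretty printing so jsoup does not re-indent or
        // normalize whitespace when serializing the body.
        doc.outputSettings(new Document.OutputSettings().prettyPrint(false));
        // html() serializes the child nodes as they are; trim() (an
        // assumption on my part) drops leading and trailing newlines.
        return doc.body().html().trim();
    }

    public static void main(String[] args) {
        String sample = "<html>\n"
                + "<head>\n"
                + "</head>\n"
                + "<body>\n"
                + "This is a sample on parsing HTML body using jsoup\n"
                + "This is a sample on parsing HTML body using jsoup\n"
                + "</body>\n"
                + "</html>";
        System.out.println(bodyWithLineBreaks(sample));
    }
}
```

Keep in mind that .html() returns markup, not plain text. If the body contains child elements or characters such as & or <, the tags and escaped entities will show up in the result, so this works best for bodies that hold only text, as in the question.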