
How to remove a specific tag from an entire HTML page using jsoup


I'm using jsoup 1.7.3 to edit some HTML files.

I need to remove the following tags from the HTML file:

<meta name="GENERATOR" content="XXXXXXXXXXXXXX">
<meta name="CREATED" content="0;0">
<meta name="CHANGED" content="0;0">

As you can see, they are all <meta> tags. How can I do that? Here is what I've tried so far.

I'm pretty sure the <meta> tags are nested in the <head> element, but removing the whole head is bad practice:

Document docsoup = Jsoup.parse(htmlin);
docsoup.head().remove();

What do you suggest?


Solution

  • I recommend using jsoup selectors, for example:

    Document document = Jsoup.parse(html);
    // Matches only the GENERATOR tag; use the same pattern for CREATED and CHANGED
    Elements selector = document.select("meta[name=GENERATOR]");

    for (Element element : selector) {
        element.remove();
    }

    String result = document.html(); // returns the HTML as a String with the elements removed
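
To cover all three tags from the question in one pass, the selector approach above can probably be extended with a comma-separated selector group. The following is a minimal sketch, not a definitive implementation: the class name `MetaTagRemover` and the helper `removeMetaTags` are hypothetical names chosen for illustration, and it assumes jsoup is on the classpath.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class MetaTagRemover {

    // Hypothetical helper: parses the HTML, drops the three <meta> tags
    // named in the question, and returns the resulting HTML as a String.
    static String removeMetaTags(String html) {
        Document document = Jsoup.parse(html);

        // A comma-separated selector group matches elements that satisfy
        // any one of the listed selectors; Elements.remove() then removes
        // every matched element from the document.
        document.select("meta[name=GENERATOR], meta[name=CREATED], meta[name=CHANGED]")
                .remove();

        return document.html();
    }

    public static void main(String[] args) {
        // Sample input modelled on the tags shown in the question.
        String html = "<html><head><title>Demo</title>"
                + "<meta name=\"GENERATOR\" content=\"XXXXXXXXXXXXXX\">"
                + "<meta name=\"CREATED\" content=\"0;0\">"
                + "<meta name=\"CHANGED\" content=\"0;0\">"
                + "<meta name=\"author\" content=\"someone\">"
                + "</head><body><p>Hello</p></body></html>";

        System.out.println(removeMetaTags(html));
    }
}
```

Only the three named tags are removed; the rest of the head (the title and any other meta tags) stays in place, which avoids the problem with `docsoup.head().remove()`.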