
Is there an easy way to strip HTML from a QString in Qt?


I have a QString with some HTML in it... is there an easy way to strip the HTML from it? I basically want just the actual text content.

<i>Test:</i><img src="blah.png" /><br> A test case

Would become:

Test: A test case

I'm curious to know if Qt has a string function or utility for this.


Solution

  • You could iterate over the string with the QXmlStreamReader class and extract all the text (provided your HTML string is guaranteed to be well-formed XML).

    Something like this:

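    QXmlStreamReader xml(htmlString);
    QString textString;
    while (!xml.atEnd()) {
        // Keep only character data; tags and attributes are skipped.
        if (xml.readNext() == QXmlStreamReader::Characters) {
            textString += xml.text();
        }
    }
    // If the input is not well-formed XML the loop stops early;
    // check xml.hasError() before trusting textString.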
    

    That said, I'm not sure this is 100% valid usage of the QXmlStreamReader API, since I last used it quite a long time ago and may have forgotten something.