Tags: html, regex, qt, extract, qregexp

Qt Regexp extract <p> tags from Html string


I have rich text in a QTextEdit and I store its HTML source in a string. What I'd like to do is extract all the lines one by one (there are 4-6 of them). The string looks like this:

//html opening stuff
<p style = attributes...><span style = attributes...>My Text</span></p>
//more lines like this
//html closing stuff

So I need each WHOLE LINE, from the opening <p> tag to the closing </p> tag (tags included). I checked and tried everything I found around here and on other sites, but still no result.

Here's my code ("htmlStyle" is the input string):

QStringList list;
QRegExp rx("(<p[^>]*>.*?</p>)");
int pos = 0;

while ((pos = rx.indexIn(htmlStyle, pos)) != -1) {
    list << rx.cap(1);
    pos += rx.matchedLength();
}

Or is there any other way to do this without regex?


Solution

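  • Below is a plain Java way without regex, hope this helps. The same logic carries over to Qt, since QString also has indexOf(), and mid() plays the role of substring():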

    // Search for "<p" rather than "<p>", because the tags carry style attributes.
    // Assumes no other tag name in the input starts with "p" (e.g. <pre>).
    int startIndex = htmlStyle.indexOf("<p");
    while (startIndex >= 0) {
        int endIndex = htmlStyle.indexOf("</p>", startIndex);
        if (endIndex < 0) {
            break; // unclosed paragraph, stop
        }
        endIndex += 4; // to include </p> in the substring
        System.out.println(htmlStyle.substring(startIndex, endIndex));
        startIndex = htmlStyle.indexOf("<p", endIndex);
    }
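
  • On the regex itself: as far as I know, QRegExp does not support the lazy quantifier "*?" on individual quantifiers. Non-greedy matching is switched on for the whole pattern with rx.setMinimal(true), using ".*" instead of ".*?". That would explain why the code in the question returns nothing useful. Since the question is C++, here is a minimal sketch of both approaches in standard C++ (std::regex and std::string instead of the Qt classes, so it runs without Qt). The function names extractParagraphsRegex and extractParagraphsFind are made up for this example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Regex variant: "*?" is a lazy quantifier in std::regex (ECMAScript grammar),
// so each match stops at the first "</p>". [\s\S] is used instead of "."
// so that a paragraph may contain line breaks.
std::vector<std::string> extractParagraphsRegex(const std::string& html) {
    static const std::regex paragraph(R"(<p(?:\s[^>]*)?>[\s\S]*?</p>)");
    std::vector<std::string> result;
    for (std::sregex_iterator it(html.begin(), html.end(), paragraph), end;
         it != end; ++it) {
        result.push_back(it->str());
    }
    return result;
}

// Index-based variant without regex, same idea as the Java loop.
std::vector<std::string> extractParagraphsFind(const std::string& html) {
    const std::string closeTag = "</p>";
    std::vector<std::string> result;
    std::size_t start = html.find("<p");
    while (start != std::string::npos) {
        // Accept only "<p>" or "<p ...>", not other tags such as "<pre>".
        const std::size_t next = start + 2;
        const bool isParagraph =
            next < html.size() && (html[next] == '>' || html[next] == ' ');
        if (!isParagraph) {
            start = html.find("<p", next);
            continue;
        }
        const std::size_t end = html.find(closeTag, next);
        if (end == std::string::npos) {
            break;  // unclosed paragraph, stop
        }
        result.push_back(html.substr(start, end + closeTag.size() - start));
        start = html.find("<p", end + closeTag.size());
    }
    return result;
}
```

    Both functions return the paragraphs in document order, with the opening and closing p tags included. Neither handles nested or malformed markup, which should be fine for the flat output of a QTextEdit but is not a general HTML parser.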