c++ regex qt qregexp

QRegExp Pattern For URLs


I am trying to match Google URLs in some text that is stored in a variable, using the pattern below.

The URLs are enclosed in double quotes.

QRegExp regExp;
regExp.setPattern("http://www.google.com/(.*)");

I manage to match the URL, but the pattern also matches all of the text that follows it, which I don't want. I have tried similar variants like the ones below, but they don't seem to work.

regExp.setPattern("http://www.google.com/(.*)\"is"); 
regExp.setPattern("http://www.google.com/^(.*)$\"");

Could anyone help me with a regular expression that matches just the URL alone?

Thanks in advance


Solution

  • The `(.*)` in your pattern is greedy, so it keeps matching until the end of the text. Although we cannot know what surrounds the URLs in your text (quotes? parentheses? whitespace?), we can write a better regular expression by using a negated character class that excludes characters that cannot be part of the URL:

    QRegExp regExp;
    regExp.setPattern("http://www.google.com/([^()\"' ]*)");
    

    Then you just need to add any other characters that can delimit a URL in your text to this negated character class.
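
To see the negated character class at work on every URL in a text, here is a minimal sketch. It uses `std::regex` from the standard library instead of `QRegExp` so that it compiles without Qt; the function name `extractGoogleUrls` is hypothetical, and escaping the dots (`\\.`) is an addition not present in the pattern above, so that `.` only matches a literal dot.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every http://www.google.com/... URL in `text`.
// Each match stops at the first character excluded by the negated class
// (parentheses, double quote, single quote, space).
std::vector<std::string> extractGoogleUrls(const std::string& text)
{
    // Same idea as the QRegExp pattern, with the dots escaped (an addition).
    static const std::regex pattern("http://www\\.google\\.com/[^()\"' ]*");

    std::vector<std::string> urls;
    for (std::sregex_iterator it(text.begin(), text.end(), pattern), end;
         it != end; ++it) {
        urls.push_back(it->str());  // the whole match is the URL
    }
    return urls;
}
```

For a text such as `<a href="http://www.google.com/search?q=qt">link</a>`, the match ends right before the closing double quote. The same pattern string should also be usable with `regExp.setPattern(...)`, since it relies only on a character class and backslash escapes.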