Detecting text like "#smth" with RegExp (with some more terms)


I'm really bad at regular expressions, so please help me.

I need to find every piece of a string that looks like #text.

The text mustn't contain any whitespace characters (\\s). Its length must be at least 2 characters ({2,}), and it must contain at least one letter (QChar::isLetter()).

Examples:

  • #c, #1, #123456, #123 456, #123_456 are incorrect
  • #cc, #text, #text123, #123text are correct

I use QRegExp.


Solution

  • Styne666 gave the right regex.

    Here is a little Perl script that tries to match its first argument against this regex:

        #!/usr/bin/env perl
        use strict;
        use warnings;
        my $arg = shift;
        if ($arg =~ m/(#(?=\d*[a-zA-Z])[a-zA-Z\d]{2,})/) {
            print "$1 MATCHES THE PATTERN!\n";
        } else {
            print "NO MATCH\n";
        }
    

    Perl is always great for quickly testing your regular expressions.

    Now, your question is a bit different. You want to find all the matching substrings in your text string, and you want to do it in C++/Qt. Here is what I could come up with in a couple of minutes:

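        #include <QRegExp>
        #include <QString>
        #include <iostream>

        using namespace std;

        int main(int argc, char *argv[])
        {
            // The text to scan must be passed as the first argument.
            if (argc < 2)
            {
                cerr << "usage: " << argv[0] << " <text>" << endl;
                return 1;
            }

            QString str = QString::fromLocal8Bit(argv[1]);
            // Capture group 1 is the tag itself: '#', then at least two
            // letters/digits, at least one of which is a letter.
            QRegExp rx("[\\s]?(\\#(?=\\d*[a-zA-Z])[a-zA-Z\\d]{2,})\\b");

            // Walk through the string, resuming after each match.
            int pos = 0;
            while ((pos = rx.indexIn(str, pos)) != -1)
            {
                QString token = rx.cap(1);
                cout << token.toStdString() << endl;
                pos += rx.matchedLength();
            }

            return 0;
        }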
    

    To test it, I feed it input like this (quoting the long string so that it is passed as a single command-line argument):

        peter@ubuntu01$ qt-regexp "#hjhj  4324   fdsafdsa  #33e #22"
    

    And it matches only two words: #hjhj and #33e.

    Hope it helps.