Search code examples
mysqldatabaseselect

Why do we need another level of [] when using character classes?


See SELECT statement example here.

I also wrote it on my own, but the statement

SELECT prod_name FROM products WHERE prod_name REGEXP '[:digit:]{4}'

also worked. Could someone explain why?

I noticed for character classes in mysql regex, they usually work with another '[]' outside of it.

Why do we need another level of square bracket?


Solution

  • The manual implies that the character class syntax works only inside a bracket expression. But in fact, the implementation works either way.

    mysql> select 'abc123' regexp '[:alpha:]' as result;
    +--------+
    | result |
    +--------+
    |      1 |
    +--------+
    

    But if you wanted to use multiple character classes, you'd need to group them somehow.

    The following doesn't work because the {6} applies only to the immediately preceding entity.

    mysql> select 'abc123' regexp '^[:digit:][:alpha:]{6}$' as result;
    +--------+
    | result |
    +--------+
    |      0 |
    +--------+
    

    The following doesn't work because it's order-sensitive. The following matches exactly one digit followed by exactly one alpha character, repeated six times.

    mysql> select 'abc123' regexp '^([:digit:][:alpha:]){6}$' as result;
    +--------+
    | result |
    +--------+
    |      0 |
    +--------+
    

    The bracket syntax is not order-sensitive, and it matches either of the characters, not both. This pattern matches either digit or alpha character, in any order, six times.

    mysql> select 'abc123' regexp '^[[:digit:][:alpha:]]{6}$' as result;
    +--------+
    | result |
    +--------+
    |      1 |
    +--------+
    

    Compare with how you would write this without the character class syntax:

    mysql> select 'abc123' regexp '^[0-9a-z]{6}$' as result;
    +--------+
    | result |
    +--------+
    |      1 |
    +--------+
    

    It's still inside brackets, to support matching any of the characters in the range.