php regex unicode character digits

How can I repeat a Unicode code point range in any order, the way \d* and \w* repeat digits and characters?


I have this regular expression:

\d*\w*[\x{0021}-\x{003F}]*

I want to repeat a digit, a word character, or a specific code point between 0021 and 003F any number of times.

I have seen that \d*\w* can match "a1", so the order doesn't matter there, but the code point characters can only be repeated at the end. How can I make the order of that repetition not matter either, as with the digits and characters, so that it matches strings like: a1!a?23!sd2


Solution

  • \w already matches everything that \d matches, so you can omit \d from the character class.

    Note that the range \x{0021}-\x{003F} also matches the digits 0-9 (see hex values 21-3F in the ASCII table), so there is some overlap as well.

    You could split it up into 2 Unicode ranges, but that would just make the character class notation longer.

    Changing it to [A-Za-z_\x{0021}-\x{003F}]+ spells out all the ranges in use. Keep in mind, though, that if you add the Unicode flag (u) in PHP, \w matches a lot more than [A-Za-z].

    To match 1 or more occurrences, you could use:

    [\w\x{0021}-\x{003F}]+
    

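As a minimal sketch of how the character class above could be used from PHP, the snippet below wraps it in a helper that checks a whole string. The function name `isAllowed` is hypothetical, and the `\A`/`\z` anchors are an assumption on my part: the question does not say whether the entire string has to match, so drop them if you only need to find a matching run inside a longer string.

```php
<?php
// Hypothetical helper (not from the original answer): returns true when $s
// consists only of word characters and code points 0x21-0x3F, in any order.
function isAllowed(string $s): bool
{
    // \A and \z anchor the match to the whole string (an assumption, see above).
    // The single character class lets every allowed character repeat freely.
    return preg_match('/\A[\w\x{0021}-\x{003F}]+\z/', $s) === 1;
}

var_dump(isAllowed('a1!a?23!sd2')); // bool(true)
var_dump(isAllowed('a1 b'));        // bool(false): a space is 0x20, just below the range
var_dump(isAllowed(''));            // bool(false): + requires at least one character

// With the u modifier, \w also accepts non-ASCII letters such as "é".
var_dump(preg_match('/\A[\w\x{0021}-\x{003F}]+\z/u', 'é1!') === 1); // bool(true)
```

The point is that one character class followed by a quantifier removes the ordering constraint that separate `\d*`, `\w*` and `[...]*` groups impose.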