I'm developing a page using modx revolution. It's a complete cms with a lot of built in functions. If I create a page in the manager it will automatically produce a friendly url for me pointing to that page.
The problem is that is does not deny the special characters we have in Norway, æøå (and uppercase ÆØÅ).
The system got a built in regex-pattern to strip the url for most bad characters, but I need the experession to strip æøå and ÆØÅ too.
The pattern looks like this:
/[\0\x0B\t\n\r\f\a&=+%#<>"~:`@\?\[\]\{\}\|\^'\\]/
Can anyone use their magic regex-knowledge to include these 6 letters? I am totally green at regex, and simply adding the letters in there did not seem to work.
PS: Please don't use the common "boo, don't use regex for this" here. The pattern is there for a reason, and i don't want to mess around with the core if we have to upgrade modx (which is pretty likely to happen sooner or later).
Try to use Unicode. I don't know modx, but since its written in php, I hope it uses php preg regular expressions.
/[\0\x0B\t\n\r\f\a&=+%#<>"~:`@\?\[\]\{\}\|\^'\\\x{00C6}\x{00E6}\x{00C5}\x{00E5}\x{00D8}\x{00F8}]/u
The u
modifier tells php to use unicode matching mode, it then interprets the regular expression as unicode string.
\x{00C6}
is the Unicode character Æ
Please check the code of the other characters by yourself to ensure I didn't made a mistake while looking them up.
See regular-expression.info for the unicode usage in php
Unicode.org for the code point