Tags: regex, regex-lookarounds

How to write a regex that matches words that do not contain any control characters?


I need to write a regex that matches words that do not contain any control characters. I read that negative lookaheads are used for that and wrote this regex:

/(?!\p{C}+)/

But I don't get why it's not working. Expected result:

word without control characters - match

word with control character ‎between - don't match


Solution

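  • Your pattern, (?!\p{C}+), consists of a lookahead only: it consumes no text and succeeds at every position that is not immediately followed by a control char, so it returns an empty match almost anywhere. You can match any control char using \p{C} (the Unicode "Other" category: control, format, surrogate, private-use and unassigned code points). You can match any char other than a control char using \P{C}. See a regex demo with your string.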

    If you want to match words not glued to some control char, use (?<!\p{C})\b\w+\b(?!\p{C}), see this regex demo. Here, (?<!\p{C}) is a negative lookbehind that matches a location not immediately preceded by a control char, \b\w+\b matches one or more word chars within word boundaries, and (?!\p{C}) is a negative lookahead that matches a location not immediately followed by a control char.

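    If you want to "exclude" CR and LF chars from the \p{C} pattern, you can use (?<![^\P{C}\r\n])\b\w+\b(?![^\P{C}\r\n]), see this regex demo. The negated class [^\P{C}\r\n] matches any char that is neither a non-control char nor CR nor LF, that is, any \p{C} char except CR and LF.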
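
The patterns above can be tried out with a short script. This is a minimal sketch, assuming JavaScript with the `u` flag and lookbehind support (for example, a recent Node.js), since the question does not name a regex flavor. The names `LRM`, `noControlChars`, `cleanWord`, `cleanWordAllowCrLf` and `findAll`, as well as the sample strings, are made up for illustration. The whole-string check `^\P{C}*$` is not part of the answer; it is one possible way to apply \P{C} when an entire string must be free of control chars.

```javascript
// Assumption: the engine supports Unicode property escapes (u flag)
// and lookbehind assertions. All names and samples are hypothetical.

// U+200E (LEFT-TO-RIGHT MARK) is a format char (Cf), so \p{C} matches it.
const LRM = "\u200E";

// 1. The original attempt: a bare lookahead consumes nothing, so it
//    succeeds with an empty match at the very first position.
const original = /(?!\p{C}+)/u;
const firstHit = original.exec("abc" + LRM + "def");

// 2. Whole-string check: every char must be outside \p{C}.
const noControlChars = /^\P{C}*$/u;

// 3. Words not glued to a control char on either side.
const cleanWord = /(?<!\p{C})\b\w+\b(?!\p{C})/gu;

// 4. Same idea, but CR and LF next to a word are tolerated.
const cleanWordAllowCrLf = /(?<![^\P{C}\r\n])\b\w+\b(?![^\P{C}\r\n])/gu;

// Collect the matched substrings of a global regex.
function findAll(re, text) {
  return Array.from(text.matchAll(re), (m) => m[0]);
}

const sample = "one\r\ntwo" + LRM + " three";

console.log(JSON.stringify(firstHit[0]), firstHit.index); // "" 0
console.log(noControlChars.test("word without control characters")); // true
console.log(noControlChars.test("word with" + LRM + " one")); // false
console.log(findAll(cleanWord, sample)); // [ 'three' ]
console.log(findAll(cleanWordAllowCrLf, sample)); // [ 'one', 'three' ]
```

In the sample, "one" is followed by CR and "two" is preceded by LF and followed by U+200E, so the stricter pattern keeps only "three", while the CR/LF-tolerant variant also keeps "one". Note that in JavaScript `\w` and `\b` stay ASCII-only even with the `u` flag, so words with non-ASCII letters would need a different word pattern.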