I've just started using regular expressions, and they are so overwhelming that even after reading the documentation I can't figure out where to start with my problem.
I have a bunch of strings:
"Project1 - Notepad"
"Project2 - Notepad"
"Project3 - Notepad"
"Untitled - Notepad"
"HeyHo - Notepad"
And I have a string containing a wildcard:
"* - Notepad"
I need a comparison of any of these strings against the one containing the wildcard to return true (with Regex.IsMatch() or something like that).
I don't usually ask for answers like this, but I just can't find what I need. Could someone point me in the right direction?
The wildcard * is equivalent to the Regex pattern ".*" (greedy) or ".*?" (not-greedy), so you'll want to perform a string.Replace():
string pattern = Regex.Escape(inputPattern).Replace("\\*", ".*?");
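Note the Regex.Escape(inputPattern) at the beginning. Since inputPattern may contain special characters used by Regex, you need to escape those characters. If you don't, they are interpreted as regex syntax, and the pattern can match input you never intended or fail to parse at all. For example, an unescaped . matches any character: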
Regex.IsMatch(input, ".NET"); // may match ".NET", "aNET", "FNET", "7NET" and many more
Because of the escaping, the wildcard * becomes \* (written "\\*" in a regular C# string literal), which is why we replace the escaped wildcard rather than just the wildcard itself.
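To make the effect of escaping concrete, here is a minimal sketch that compares the escaped conversion with a naive one that skips Regex.Escape(). The class and method names (EscapeDemo, ToRegex, ToRegexNaive) and the sample pattern "*.NET (v1)" are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: escape first, then replace the escaped wildcard.
    public static string ToRegex(string inputPattern)
    {
        return Regex.Escape(inputPattern).Replace("\\*", ".*?");
    }

    // Hypothetical counter-example: no escaping, so '.', '(' and ')' keep
    // their regex meaning.
    public static string ToRegexNaive(string inputPattern)
    {
        return inputPattern.Replace("*", ".*?");
    }

    public static void Main()
    {
        string wildcard = "*.NET (v1)";

        // Regex.Escape also escapes whitespace, so spaces become "\ ".
        Console.WriteLine(ToRegex(wildcard));      // .*?\.NET\ \(v1\)
        Console.WriteLine(ToRegexNaive(wildcard)); // .*?.NET (v1)

        // The escaped pattern matches the literal text only.
        Console.WriteLine(Regex.IsMatch("ASP.NET (v1)", ToRegex(wildcard)));      // True
        Console.WriteLine(Regex.IsMatch("ASPxNET v1", ToRegex(wildcard)));        // False

        // The naive pattern treats "(v1)" as a group and '.' as any character.
        Console.WriteLine(Regex.IsMatch("ASP.NET (v1)", ToRegexNaive(wildcard))); // False
        Console.WriteLine(Regex.IsMatch("ASPxNET v1", ToRegexNaive(wildcard)));   // True
    }
}
```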
To use the pattern, you can do either:
Regex.IsMatch(input, pattern);
or
var regex = new Regex(pattern);
regex.IsMatch(input);
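One thing worth keeping in mind is that Regex.IsMatch() succeeds if the pattern matches anywhere in the input. If the wildcard string is meant to describe the whole title, which is an assumption about the question rather than something it states, the pattern can be anchored with ^ and $. The sketch below uses a hypothetical WildcardMatcher.IsWildcardMatch helper together with the window titles from the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardMatcher
{
    // Hypothetical helper: true if the whole input matches the wildcard string.
    public static bool IsWildcardMatch(string input, string wildcard)
    {
        // Escape regex metacharacters, then turn the escaped '*' into ".*?".
        string body = Regex.Escape(wildcard).Replace("\\*", ".*?");

        // Assumption: the wildcard should cover the entire input, so anchor it.
        return Regex.IsMatch(input, "^" + body + "$");
    }

    public static void Main()
    {
        string wildcard = "* - Notepad";
        string[] titles =
        {
            "Project1 - Notepad",
            "Untitled - Notepad",
            "HeyHo - Notepad",
            "Project1 - Notepad++", // rejected only because of the anchors
            "Notepad - Project1"
        };

        foreach (string title in titles)
        {
            Console.WriteLine(title + " => " + IsWildcardMatch(title, wildcard));
        }
    }
}
```

Without the anchors, "Project1 - Notepad++" would also be reported as a match, because the text " - Notepad" appears inside it.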
The difference between greedy and not-greedy is in how much the pattern will try to match.
Consider the following string: "hello (x+1)(x-1) world". You want to match the opening bracket ( and the closing bracket ) as well as anything in-between.
Greedy would match only "(x+1)(x-1)" and nothing else. It basically matches the longest substring it can find.
Not-greedy would match "(x+1)" and "(x-1)" and nothing else. In other words: the shortest substrings possible.
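The bracket example can be checked directly. This is a small sketch; the class and method names (GreedyDemo, FindAll) are invented for illustration, and the two patterns are simply the greedy and not-greedy forms wrapped in escaped brackets.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Hypothetical helper: collect every match of the pattern in the text.
    public static List<string> FindAll(string text, string pattern)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string text = "hello (x+1)(x-1) world";

        // Greedy: ".*" runs as far as it can, up to the last ')'.
        foreach (string s in FindAll(text, @"\(.*\)"))
        {
            Console.WriteLine("greedy:     " + s); // (x+1)(x-1)
        }

        // Not-greedy: ".*?" stops at the first ')' it can reach.
        foreach (string s in FindAll(text, @"\(.*?\)"))
        {
            Console.WriteLine("not-greedy: " + s); // (x+1) then (x-1)
        }
    }
}
```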
@MaximZabolotskikh asked about the possibility of escaping the wildcard character, so that "Hello \* World" would literally match "Hello * World".
Doing this requires multiple substitutions:
1. Escape the regex.
2. Substitute any occurrence of \\\\ (double backslash) with the escape character \x1b. This allows us to identify backslashes that were in the original input.
3. Substitute any occurrence of \x1b\x1b with \\\\. This allows matching a literal \ by using \\.
4. Use the negative lookbehind pattern (?<!\x1b)\\\* to substitute * with the wildcard pattern (either .* or .*?) but only if it isn't preceded by a backslash. This will insert the wildcard pattern in "Hello * World" and "Hello \\* World", but not in "Hello \* World". We need to match \\\* because * is changed to \* after escaping, so we're actually matching a literal \ (using \\) and a literal * (using \*).
5. Any escaped * will now be \x1b\\*, which will eventually be substituted to \\\\*, but we actually want it to be \\* instead so we can match a literal * later on. Therefore, substitute \x1b\\* with \\*.
6. Finally, substitute all \x1b back to \\.
Here's an example (I am using @"" here to avoid typing double backslashes):
string pattern = Regex.Escape(inputPattern);
pattern = pattern.Replace(@"\\", "\x1b");
pattern = pattern.Replace("\x1b\x1b", @"\\");
pattern = Regex.Replace(pattern, @"(?<!\x1b)\\\*", ".*?");
pattern = pattern.Replace("\x1b\\*", @"\*");
pattern = pattern.Replace('\x1b', '\\');
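To see the substitutions working end to end, the same six statements can be wrapped in a method and tried on a few inputs. The class and method names (EscapableWildcard, ToRegex) are hypothetical, and the resulting patterns are left unanchored here, so Regex.IsMatch() still looks for the text anywhere in the input.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapableWildcard
{
    // Hypothetical wrapper around the six substitution steps shown above.
    public static string ToRegex(string inputPattern)
    {
        string pattern = Regex.Escape(inputPattern);
        // Mark each backslash that came from the original input with ESC.
        pattern = pattern.Replace(@"\\", "\x1b");
        // Two marked backslashes in a row mean one literal backslash.
        pattern = pattern.Replace("\x1b\x1b", @"\\");
        // Unmarked "\*" is a real wildcard.
        pattern = Regex.Replace(pattern, @"(?<!\x1b)\\\*", ".*?");
        // Marked "\*" is an escaped wildcard: keep it as a literal '*'.
        pattern = pattern.Replace("\x1b\\*", @"\*");
        // Restore any remaining markers to backslashes.
        pattern = pattern.Replace('\x1b', '\\');
        return pattern;
    }

    public static void Main()
    {
        // Plain wildcard: '*' matches any text.
        string wild = ToRegex("Hello * World");
        Console.WriteLine(Regex.IsMatch("Hello big World", wild));    // True

        // Escaped wildcard: only a literal '*' is accepted.
        string literal = ToRegex(@"Hello \* World");
        Console.WriteLine(Regex.IsMatch("Hello * World", literal));   // True
        Console.WriteLine(Regex.IsMatch("Hello big World", literal)); // False

        // Escaped backslash followed by a wildcard.
        string mixed = ToRegex(@"Hello \\* World");
        Console.WriteLine(Regex.IsMatch(@"Hello \big World", mixed)); // True
    }
}
```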