Given a messageformat string such as str
below, I want to be able to get the "notifications" and "name" values, which are used to display the text values.
var str = @"You have {notifications, plural,
zero {no notifications}
one {one notification}
=42 {a universal amount of notifications}
other {# notifications}
}. Have a nice day, {name}!";
I have tried using a regex such as:
var matches = Regex.Matches(str, @"{(.*?)}");
//var matches = Regex.Matches(str, @"(?<=\{)[^}{]*(?=\})");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct().ToList();
But the above does not take into account that {notifications, ...}
is itself wrapped in curly braces, so it returns the inner values (also wrapped in curly braces), which are not needed.
So in brief, I just want to be able to parse a string such as str above and get notifications and name as the returned values.
A string such as var str2 = @"Hello {name}" should just return name as the value.
EDIT
The values notifications and name will not be known in advance. I have just used them as examples of the values I need returned from the string.
var str = @"You have {notifications, plural,
zero {no notifications}
one {one notification}
=42 {a universal amount of notifications}
other {# notifications}
}. Have a nice day, {name}!";
// get matches skipping nested curly braces
var matches =
Regex.Matches(str, @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct()
.Select(v => Regex.Match(v, @"^\w+").Value) // take 1st word
.ToList();
which results in (copied from Visual Studio Locals window while debugging)
results Count = 2 System.Collections.Generic.List<string>
[0] "notifications"
[1] "name"
... original answer follows ...
One thing to note about the pattern in the original question: . doesn't match line breaks, which is one reason why it currently matches the nested values (see this source).
This article addresses the main challenge noted in the original question: the nested curly braces.
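To make the point about . and line breaks concrete, here is a small sketch that runs the question's pattern against a shortened sample string. The class and method names (LazyDotDemo, InnerValues) are made up for illustration. Because . cannot cross the line break after "plural,", no match can begin at {notifications, so only the inner blocks and {name} are captured.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical demo class; not part of the original question or answer.
public static class LazyDotDemo
{
    // Runs the question's pattern and returns what group 1 captured.
    public static string[] InnerValues(string input) =>
        Regex.Matches(input, @"{(.*?)}")
             .Cast<Match>()
             .Select(m => m.Groups[1].Value)
             .ToArray();

    public static void Main()
    {
        var str = @"You have {notifications, plural,
 one {one notification}
 other {# notifications}
}. Have a nice day, {name}!";

        // The outer {notifications, ...} block is never matched as a whole,
        // because '.' does not match '\n' unless RegexOptions.Singleline is set.
        foreach (var value in InnerValues(str))
            Console.WriteLine(value);
    }
}
```

Note that switching on RegexOptions.Singleline would not fix the problem by itself: the lazy .*? would then stop at the first closing brace it meets, which belongs to a nested block.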
var str = @"You have {notifications, plural,
zero {no notifications}
one {one notification}
=42 {a universal amount of notifications}
other {# notifications}
}. Have a nice day, {name}!";
// get matches skipping nested curly braces
var matches =
Regex.Matches(str, @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct().ToList();
which results in (copied from Visual Studio Locals window while debugging)
results Count = 2 System.Collections.Generic.List<string>
[0] "notifications, plural,\r\n zero {no notifications}\r\n one {one notification}\r\n =42 {a universal amount of notifications}\r\n other {# notifications}\r\n "
[1] "name"
(or if you were to print these results to the console):
// Result 0 would look like:
notifications, plural,
zero {no notifications}
one {one notification}
=42 {a universal amount of notifications}
other {# notifications}
// Result 1 would look like:
name
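For readers who have not met balancing groups before, the pattern used above can be read piece by piece. The sketch below is the same pattern spread over several lines with RegexOptions.IgnorePatternWhitespace so that each part can carry a comment. The class and method names (BalancedBraces, TopLevelBlocks) and the sample input are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper; the pattern itself is the one from the answer.
public static class BalancedBraces
{
    private static readonly Regex TopLevel = new Regex(@"
        {                        # opening brace of a top-level block
        (                        # group 1: the content of the block
          (?:
              [^{}]              # any character that is not a brace
            | (?<counter>{)      # nested opening brace: push a capture
            | (?<-counter>})     # nested closing brace: pop a capture
          )+
          (?(counter)(?!))       # fail if a nested brace is still open
        )
        }                        # closing brace of the top-level block
        ", RegexOptions.IgnorePatternWhitespace);

    // Returns the content of every top-level {...} block, nested blocks included.
    public static string[] TopLevelBlocks(string input) =>
        TopLevel.Matches(input)
                .Cast<Match>()
                .Select(m => m.Groups[1].Value)
                .ToArray();

    public static void Main()
    {
        foreach (var block in TopLevelBlocks("Hi {a, select, x {y} other {z}} and {name}"))
            Console.WriteLine(block);
    }
}
```

The pop alternative can only succeed while a nested opening brace is still on the counter stack, so the loop stops at the brace that closes the top-level block and the final } in the pattern consumes it.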
I came back to this and realized that the question asked for just the single words as results.
(I'm repeating the above snippet with the additional select statement to show the full solution)
var str = @"You have {notifications, plural,
zero {no notifications}
one {one notification}
=42 {a universal amount of notifications}
other {# notifications}
}. Have a nice day, {name}!";
// get matches skipping nested curly braces
var matches =
Regex.Matches(str, @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}");
var results = matches.Cast<Match>().Select(m => m.Groups[1].Value).Distinct()
.Select(v => Regex.Match(v, @"^\w+").Value) // take 1st word
.ToList();
which results in (copied from Visual Studio Locals window while debugging)
results Count = 2 System.Collections.Generic.List<string>
[0] "notifications"
[1] "name"
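If this is needed in more than one place, the steps can be wrapped in a small helper. The following is only a sketch: the class and method names (MessageFormatArguments, GetNames) are made up, and it deliberately differs from the snippet above in two ways. Distinct() runs after the first word has been extracted, so a name used both as {name} and inside a longer block is returned once, and blocks that do not start with a word character are skipped rather than returned as empty strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper built around the balancing-group pattern from the answer.
public static class MessageFormatArguments
{
    private const string TopLevelBlock =
        @"{((?:[^{}]|(?<counter>{)|(?<-counter>}))+(?(counter)(?!)))}";

    public static List<string> GetNames(string message) =>
        Regex.Matches(message, TopLevelBlock)
             .Cast<Match>()
             .Select(m => Regex.Match(m.Groups[1].Value, @"^\w+").Value) // take 1st word
             .Where(name => name.Length > 0) // assumption: ignore blocks with no leading word
             .Distinct()                     // de-duplicate the extracted names
             .ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", GetNames("Hello {name}")));
        Console.WriteLine(string.Join(", ", GetNames(
            "{name}, you have {count, plural, one {# item} other {# items}}. Bye, {name}!")));
    }
}
```

The first call covers the simple str2 case from the question, and the second shows a name that appears twice being reported only once.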
(I just found this interesting and spent a little more time researching/learning and thought it worth including some more related information)
Conversations here and here include some opinions for and against using regex for this type of problem.
Regardless of the above opinions, .NET creators deemed it appropriate to implement balancing group definitions--a functionality this answer uses: