c# regex split edifact

Regex.Split ignores empty results


I have this string:

IMD+F++:::PS4 SAINTS R IV R?+GA'

I would like to split it in two steps. First, I want to split on +, except for escaped pluses ("?+"). Second, I want to split each result on :, except for escaped colons ("?:").

With the following regex I can split my string on the unescaped pluses:

string[] Data = Regex.Split("IMD+F++:::PS4 SAINTS R IV R?+GA'", @"(?<![\?])[\+]+"); 

Result:

[0] IMD
[1] F
[2] :::PS4 SAINTS R IV R?+GA'

The result is incorrect. The array should contain four elements, but the empty result is removed. I need empty results to stay in the array. The result should be:

[0] IMD
[1] F
[2]
[3] :::PS4 SAINTS R IV R?+GA'

Does anyone know why it behaves this way? Any suggestions?


Solution

  • You're explicitly saying that you want to split on "at least one plus" - that's what [\+]+ means. That's why it's treating ++ as a single separator. Just split on a single plus - and note that you don't need to put that into a set of characters:

    string[] data = Regex.Split("IMD+F++:::PS4 SAINTS R IV R?+GA'", @"(?<!\?)\+");
    

    If you do want to put it into a set of characters, you don't need to escape it. The only reason for escaping it above is to say "this isn't a quantifier, it's just a plus character". So this is equally good:

    string[] data = Regex.Split("IMD+F++:::PS4 SAINTS R IV R?+GA'", @"(?<![?])[+]");
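For the second step in the question (splitting each result on unescaped colons), the same lookbehind idea can be reused. The following is only a minimal sketch: the class name `EdifactSplitSketch` and the helper names `SplitElements` and `SplitComponents` are made up for illustration, and it assumes the data never contains an escaped question mark ("??") directly in front of a separator, which a plain lookbehind would misread as an escaped separator.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, not part of any library.
public static class EdifactSplitSketch
{
    // Step 1: split on every single '+' that is not preceded by '?'.
    // No '+' quantifier after the plus, so "++" yields an empty entry.
    public static string[] SplitElements(string segment) =>
        Regex.Split(segment, @"(?<!\?)\+");

    // Step 2: split one element on every single ':' not preceded by '?'.
    public static string[] SplitComponents(string element) =>
        Regex.Split(element, @"(?<!\?):");

    public static void Main()
    {
        string[] elements = SplitElements("IMD+F++:::PS4 SAINTS R IV R?+GA'");

        for (int i = 0; i < elements.Length; i++)
        {
            string[] components = SplitComponents(elements[i]);
            // Show the components of each element, separated by " | ".
            Console.WriteLine($"[{i}] {string.Join(" | ", components)}");
        }
    }
}
```

With the sample string, step 1 gives four elements (the third one empty), and step 2 splits the last element into four components, of which the first three are empty. The escaped "?+" stays inside the final component.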