c# regex token

What is the correct Regex for this?


I have a text like this:

(0.2 furry
  (0.81 fast
    (0.3)
    (0.2)
  )
  (0.1 fishy
    (0.3 freshwater
      (0.01)
      (0.01)
    )
    (0.1)
  )
)

I want to split it using Regex.Split. The format I want the string[] to be is:

( 0.2 furry ( 0.81 fast ( 0.3 ) ( 0.2 ) ) ( ... (you get the point)

I'm using the pattern ([()\\s]), but this also gives me strings for the spaces. Could you please tell me the correct pattern?


Solution

  • You can either match (as others have answered) or split on spaces and parentheses:

    Note that the linked demo replaces matches with \n for display purposes.
    In your code you would instead split on the pattern below.

    \s+|(?<=\()(?!\s)|(?<!\s)(?=\))
    
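    As a minimal sketch of how this pattern might be passed to Regex.Split: the class name SplitDemo and the shortened input are only illustrative, and the Trim() call is an added assumption, there because Regex.Split returns an empty first or last element when a match sits at the very start or end of the input (for example a trailing newline).

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    class SplitDemo
    {
        static void Main()
        {
            // Shortened version of the text from the question.
            string input = "(0.2 furry\n  (0.81 fast\n    (0.3)\n    (0.2)\n  )\n)";

            // Verbatim string, so the backslashes reach the regex engine unchanged.
            string pattern = @"\s+|(?<=\()(?!\s)|(?<!\s)(?=\))";

            // Trim first: leading or trailing whitespace would otherwise
            // yield an empty string at the start or end of the array.
            string[] tokens = Regex.Split(input.Trim(), pattern);

            Console.WriteLine(string.Join(" ", tokens));
            // ( 0.2 furry ( 0.81 fast ( 0.3 ) ( 0.2 ) ) )
        }
    }
    ```

    The pattern contains no capturing groups, so unlike ([()\\s]) the delimiters themselves are not added to the result.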

    This pattern is composed of 3 options:

    1. \s+ matches one or more whitespace characters.
    2. (?<=\()(?!\s) matches a position that is preceded by ( but not followed by whitespace (if whitespace follows, the first option already splits there).
    3. (?<!\s)(?=\)) matches a position that is followed by ) but not preceded by whitespace (if whitespace precedes, the first option has already split there, and matching again would produce an empty string).
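
    For comparison, the matching approach mentioned at the top of this answer could look roughly like the sketch below. The pattern [()]|[^\s()]+ is an assumption made here for illustration, not taken from the other answers: it treats each parenthesis as its own token and any run of characters that are neither whitespace nor parentheses as a token.

    ```csharp
    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    class MatchDemo
    {
        static void Main()
        {
            // Shortened version of the text from the question.
            string input = "(0.1 fishy\n  (0.3 freshwater\n    (0.01)\n  )\n)";

            // A token is either a single parenthesis or a run of characters
            // that are neither whitespace nor parentheses.
            string pattern = @"[()]|[^\s()]+";

            string[] tokens = Regex.Matches(input, pattern)
                                   .Cast<Match>()
                                   .Select(m => m.Value)
                                   .ToArray();

            Console.WriteLine(string.Join(" ", tokens));
            // ( 0.1 fishy ( 0.3 freshwater ( 0.01 ) ) )
        }
    }
    ```

    Matching never produces empty strings, so no trimming is needed. It also separates a ) that is immediately followed by ( with no whitespace in between, a case none of the three split options covers.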