c# parsing text-parsing

Parsing Nested Text in C Sharp


If I have a series of strings that have this base format:

"[id value]" // id and value are space-delimited; id will never contain spaces

They can then be nested like this:

[a]
[a [b value]]
[a [b [c value]]]

So every item can have 0 or 1 value entries.

What is the best approach to go about parsing this format? Do I just use stuff like string.Split() or string.IndexOf() or are there better methods?


Solution

  • There is nothing wrong with the Split and IndexOf methods; they exist for string parsing. Here is a sample for your case:

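            string str = "[a [b [c [d value]]]]";
    
            // Each pass peels off the innermost item and works outward.
            while (str.Trim().Length > 0)
            {
                // The last '[' and the first ']' delimit the innermost item.
                int start = str.LastIndexOf('[');
                int end = str.IndexOf(']');
                if (start < 0 || end < start) break; // unbalanced brackets: stop instead of throwing
    
                string s = str.Substring(start + 1, end - (start + 1)).Trim();
                // This is what you are looking for: pair[0] is the id, and the length is 2
                // when the item has a plain value (1 when its nested child was already removed).
                string[] pair = s.Split(' ');
    
                // Remove the item just processed so the next pass sees its parent.
                str = str.Remove(start, (end + 1) - start);
            }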
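
If you would rather keep the nesting as a tree than peel items off from the inside out, a small recursive-descent parser is one alternative. The following is only a sketch under assumptions: the Node and NestedParser names are made up for illustration, an id is assumed to end at the first space or bracket, and an item is assumed to hold either one plain value or one nested item, as described in the question.

```csharp
using System;

// Hypothetical node type: an id plus either a plain value, a nested child, or neither.
public sealed class Node
{
    public string Id = "";
    public string Value;   // null when the item has no plain value
    public Node Child;     // null when the item has no nested item
}

public static class NestedParser
{
    // Parses a single item such as "[a [b value]]" and rejects trailing text.
    public static Node Parse(string text)
    {
        int pos = 0;
        Node root = ParseItem(text, ref pos);
        SkipSpaces(text, ref pos);
        if (pos != text.Length)
            throw new FormatException($"Unexpected text at position {pos}.");
        return root;
    }

    private static Node ParseItem(string text, ref int pos)
    {
        SkipSpaces(text, ref pos);
        Expect(text, ref pos, '[');

        // The id runs up to the first space or bracket.
        int idStart = pos;
        while (pos < text.Length && text[pos] != ' ' && text[pos] != '[' && text[pos] != ']')
            pos++;
        var node = new Node { Id = text.Substring(idStart, pos - idStart) };
        if (node.Id.Length == 0)
            throw new FormatException($"Missing id at position {pos}.");

        SkipSpaces(text, ref pos);
        if (pos < text.Length && text[pos] == '[')
        {
            // The value is itself an item: recurse.
            node.Child = ParseItem(text, ref pos);
        }
        else
        {
            // Plain value: everything up to the next bracket.
            int valueStart = pos;
            while (pos < text.Length && text[pos] != '[' && text[pos] != ']')
                pos++;
            string value = text.Substring(valueStart, pos - valueStart).Trim();
            if (value.Length > 0)
                node.Value = value;
        }

        SkipSpaces(text, ref pos);
        Expect(text, ref pos, ']');
        return node;
    }

    private static void SkipSpaces(string text, ref int pos)
    {
        while (pos < text.Length && char.IsWhiteSpace(text[pos]))
            pos++;
    }

    private static void Expect(string text, ref int pos, char expected)
    {
        if (pos >= text.Length || text[pos] != expected)
            throw new FormatException($"Expected '{expected}' at position {pos}.");
        pos++;
    }
}
```

Calling NestedParser.Parse("[a [b value]]") should give a node with Id "a" whose Child has Id "b" and Value "value". Unbalanced input throws a FormatException rather than being silently accepted, which the Split/IndexOf loop does not check for on its own.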