
How to parse a text file with multiple delimiters


I have a text file I'm trying to parse. The data is a single string that uses four types of delimiters. There's also a record count at the end of the file, which isn't part of the data and can be ignored.

Delimiters:

Beginning of data: ~
Field separator: |
End of record: #
End of data: ^

Sample text file:

~001|John|Smith|300#002|Abby|Williams|250#003|Tom|Jones|400#004|Claire|Benton|300^
Count:4

The parsed data should be stored in a list or collection of "Account" objects:

public class Account
{
    public string IdNum { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string AmtDue { get; set; }
}

I'm still new to programming, so could you explain the best way to parse and store this data? Thanks in advance.


Solution

  • If you know all delimiters, you can use something like this:

    using System;
    using System.Linq;

    var s = "~001|John|Smith|300#002|Abby|Williams|250#003|Tom|Jones|400#004|Claire|Benton|300^Count: 4";
    
    // Drop the leading '~', then keep everything up to (but not including) '^'.
    var data = new string(s.SkipWhile(c => c == '~').TakeWhile(c => c != '^').ToArray());
    // Split the data into records, then split each record into its four fields.
    var records = data.Split('#');
    var accounts = records.Select(record => record.Split('|'))
      .Select(items => new Account
      {
        IdNum = items[0],
        FirstName = items[1],
        LastName = items[2],
        AmtDue = items[3]
      })
      .ToList();
    
    foreach (var account in accounts)
    {
      Console.WriteLine(account.IdNum);
      Console.WriteLine(account.FirstName);
      Console.WriteLine(account.LastName);
      Console.WriteLine(account.AmtDue);
      Console.WriteLine();
    }
    

    The idea is to split on the record delimiter first, then split each record on the field delimiter.

    P.S. Don't forget to check for null values and validate array lengths where needed.
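The null and length checks mentioned in the P.S. could look something like the sketch below. It is only one possible approach, not part of the original answer: the `AccountParser` class and its `Parse` method are hypothetical names, and silently skipping malformed records (rather than throwing) is an assumption you may want to change. It uses `IndexOf` and `Substring` to find the data between the markers, so text before `~` or after `^` (such as the record count) is ignored.

```csharp
using System;
using System.Collections.Generic;

public class Account
{
    public string IdNum { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string AmtDue { get; set; }
}

// Hypothetical helper class, introduced only for this illustration.
public static class AccountParser
{
    // Returns the accounts found between the first '~' and the next '^'.
    // Returns an empty list if the input is null/empty or a marker is missing.
    public static List<Account> Parse(string text)
    {
        var accounts = new List<Account>();
        if (string.IsNullOrEmpty(text))
            return accounts;

        int start = text.IndexOf('~');
        if (start < 0)
            return accounts; // no beginning-of-data marker

        int end = text.IndexOf('^', start + 1);
        if (end < 0)
            return accounts; // no end-of-data marker

        string data = text.Substring(start + 1, end - start - 1);

        // RemoveEmptyEntries drops empty records, e.g. from a trailing '#'.
        string[] records = data.Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string record in records)
        {
            string[] fields = record.Split('|');

            // Assumption: a record must have exactly four fields; others are skipped.
            if (fields.Length != 4)
                continue;

            accounts.Add(new Account
            {
                IdNum = fields[0].Trim(),
                FirstName = fields[1].Trim(),
                LastName = fields[2].Trim(),
                AmtDue = fields[3].Trim()
            });
        }

        return accounts;
    }
}
```

To read from a file instead of a string literal, you could pass in the file contents, for example `AccountParser.Parse(System.IO.File.ReadAllText("accounts.txt"))`, where the file name is just a placeholder.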