c# regex lexer parser-generator

Parsing custom data tokens and replacing with values in C#


I have about 10 pieces of data from a record, and I want to define the layout of a string in which this data is returned, with the option of leaving some pieces out. My thought was to use an enum to give integer values to my tokens/fields and then use a format like {0}{1}{2}{3}, or something as complicated as {4} - {3}{1} [{8}]. Each token corresponds to a field in my database. For instance, I have this enum for the tokens relating to payments made:

    enum Token
    {
        AccountMask = 0,
        AccountLast4 = 1,
        AccountFirstDigit = 2,
        AccountFirstLetter = 3,
        ItemNumber = 4,
        Amount = 5
    }

The account mask is a string like VXXXXX1234, where the V stands for Visa and 1234 are the last 4 digits of the card. Sometimes clients want the V, and sometimes they want the first digit (it's easy to translate a card type into a first digit).

My goal is to create something reusable that generates a string from a format string containing tokens, replacing each token in place with the data associated with the number inside it.

For example, using the mask above and my enum, suppose I define the format 9{2}{1}{4:[0:0000000000]} and the item number is 678934.

This would then translate to 9412340000678934, where the inner part of token 4 becomes the format passed to String.Format for that value. Any text around the tokens is left untouched and kept in place.
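
To make the arithmetic in that example concrete, here is a minimal sketch that assembles the expected output by hand. The `ExampleCheck` class and its hard-coded values are illustrative assumptions; only the mask, the item number, and the format come from the example itself.

```csharp
using System;

public static class ExampleCheck
{
    public static string Build()
    {
        string accountMask = "VXXXXX1234";
        int itemNumber = 678934;

        // Token {2}: first digit derived from the card-type letter (V = Visa -> 4).
        string firstDigit = accountMask[0] == 'V' ? "4" : "?";

        // Token {1}: last four characters of the mask.
        string last4 = accountMask.Substring(accountMask.Length - 4);

        // Token {4:[0:0000000000]}: the bracketed part becomes the composite
        // format "{0:0000000000}", which zero-pads the number to 10 digits.
        string inner = "[0:0000000000]".Replace("[", "{").Replace("]", "}");
        string paddedItem = string.Format(inner, itemNumber);

        // The literal 9 in the format is copied through unchanged.
        return "9" + firstDigit + last4 + paddedItem;
    }
}
```

`ExampleCheck.Build()` yields 9412340000678934: the literal 9, then 4, then 1234, then 0000678934.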

My issue is with the string manipulation and the best practice for it. I have been told that regular expressions can be costly if you create them on the fly. As a CS major, I have a feeling the "right" (however complex) solution is to write a lexer/parser for my tokens. I have no experience writing a lexer/parser in C#, so I'm not sure of the best practices around it. I'm looking for guidance on a system that is efficient and easy to tweak.
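
For a grammar this small, the lexer/parser route can be a single left-to-right scan with no regex at all. The following is only a sketch under assumptions: the `TokenScanner` class, the `Expand` method, and the `resolve` callback are hypothetical names, and the token syntax is taken to be exactly {n} or {n:[0:fmt]} as in the example above.

```csharp
using System;
using System.Text;

public static class TokenScanner
{
    // Expands tokens of the form {n} or {n:[0:fmt]} in one left-to-right pass.
    // 'resolve' maps a token number to its data; it is supplied by the caller.
    public static string Expand(string format, Func<int, object> resolve)
    {
        var output = new StringBuilder(format.Length);
        int i = 0;
        while (i < format.Length)
        {
            int close = format[i] == '{' ? format.IndexOf('}', i) : -1;
            if (close < 0)
            {
                output.Append(format[i++]);   // literal text is kept in place
                continue;
            }

            string body = format.Substring(i + 1, close - i - 1);   // e.g. "4:[0:0000]"
            int colon = body.IndexOf(':');
            string number = colon < 0 ? body : body.Substring(0, colon);

            if (!int.TryParse(number, out int code))
            {
                output.Append(format[i++]);   // not a token; treat '{' as literal
                continue;
            }

            object data = resolve(code);
            // Brackets stand in for braces so the inner format survives the outer scan.
            string inner = colon < 0
                ? "{0}"
                : body.Substring(colon + 1).Replace('[', '{').Replace(']', '}');
            output.AppendFormat(inner, data);   // throws FormatException on a malformed inner format
            i = close + 1;
        }
        return output.ToString();
    }
}
```

Calling `TokenScanner.Expand("9{2}{1}{4:[0:0000000000]}", resolve)` with a `resolve` that returns 4, "1234", and 678934 for tokens 2, 1, and 4 produces the string from the example. Because the output is built in a `StringBuilder`, each token is replaced exactly once at its own position, which avoids repeated `String.Replace` calls over the whole string.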


Solution

  • I ended up storing the regex as a static object in the class, then looping through the matches to perform the replacements and build the final token string.

    var token = request.TokenFormat;
    
    var matches = tokenExpression.Matches(request.TokenFormat);
    
    foreach (Match match in matches)
    {
        var value = match.Value;
        var tokenCode = (Token)Convert.ToInt32(value.Substring(1, (value.Contains(":") ? value.IndexOf(":") : value.IndexOf("}")) - 1));
    
        object data = null;
    
        switch (tokenCode)
        {
            case Token.AccountMask:
                data = accountMask;
                break;
            case Token.AccountLast4:
                data = accountMask.Substring(accountMask.Length - 4);
                break;
            case Token.AccountFirstDigit:
                string firstLetter = accountMask.Substring(0, 1);
    
                switch (firstLetter.ToUpper())
                {
                    case "A":
                        data = 3;
                        break;
                    case "V":
                        data = 4;
                        break;
                    case "M":
                        data = 5;
                        break;
                    case "D":
                        data = 6;
                        break;
                }
    
                break;
            case Token.AccountFirstLetter:
                data = accountMask.Substring(0, 1);
                break;
            case Token.ItemNumber:
                if(item != null)
                    data = item.PaymentId;
                break;
            case Token.Amount:
                if (item != null)
                    data = item.Amount;
                break;
            case Token.PaymentMethodId:
                if (paymentMethod != null)
                    data = paymentMethod.PaymentMethodId;
                break;
        }
    
        // Run the format regex once and reuse the result.
        Match formatMatch = formatExpression.Match(value);
    
        if (formatMatch.Success)
        {
            // Turn the bracketed inner format, e.g. [0:0000], into {0:0000} for String.Format.
            string format = formatMatch.Value.Replace("[", "{").Replace("]", "}");
    
            token = token.Replace(value, String.Format(format, data));
        }
        else
        {
            token = token.Replace(value, String.Format("{0}", data));
        }
    }
    
    return token;
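
The snippet above uses `tokenExpression` and `formatExpression` without showing how they are defined. One possible pair of static, precompiled definitions is sketched below. Both patterns and the `TokenPatterns` class name are assumptions inferred from the token syntax in the question, not the answerer's actual expressions.

```csharp
using System.Text.RegularExpressions;

public static class TokenPatterns
{
    // Hypothetical pattern: {digits}, optionally followed by :[...] before the closing brace.
    public static readonly Regex TokenExpression =
        new Regex(@"\{\d+(:\[[^\]]*\])?\}", RegexOptions.Compiled);

    // Hypothetical pattern: the bracketed inner format, e.g. [0:0000000000].
    public static readonly Regex FormatExpression =
        new Regex(@"\[[^\]]*\]", RegexOptions.Compiled);
}
```

Keeping the `Regex` instances in `static readonly` fields means each pattern is constructed once per process rather than on every call, which addresses the cost concern raised in the question.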