c# asp.net asp.net-mvc asp.net-core

How to extract key/value pairs as accurately as possible from a simple string using C#


I have a simple string, line, that contains the content below.

logo Name raj mobile 9038874774 address 6-98 india bill auto generated

I am trying to split it into key/value pairs to get my details. The output I expect looks like this:

[0] Key: Name, Value: Raj
[1] Key: Mobile, Value: 9038874774
[2] Key: Address, Value: 6-98 india

Below is the code I am using to try to achieve this:

string[] lines = new string[] { "logo Name raj mobile 9038874774 address 6-98 india bill auto generated" };
   
// Get the position of the first space within each line

var pairs = lines.Select(l => new { Line = l, Pos = l.IndexOf(" ") });

// Build a dictionary of key/value pairs by splitting each line at that first space
var dictionary = pairs.ToDictionary(p => p.Line.Substring(0, p.Pos), p => p.Line.Substring(p.Pos + 1));

// Now you can retrieve values by key:
var value1 = dictionary["Name"]; 

This is how the output looks in the debugger:

[image: debugger output]

The string contains some words that are not needed, such as logo and bill auto generated; these should not end up in the key/value pairs. Please suggest how to achieve this as accurately as possible. The string comes from an image file that was converted to text using Tesseract OCR.


Solution

  • Here's an example using string.Split. I changed the casing of the words in line to match the keys, so you may have to handle case differences in real input. Also, I'm assuming Bill is a key that is safe to ignore (the same concern that @Mark Seemann raised in the comments).

    There are other potential key issues though, for example, what if the name value is Bill?

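    using System;
    using System.Collections.Generic;

    public static class Program
    {
        // Words that start a key/value pair we want to keep.
        private static readonly HashSet<string> _extractKeys = new() { "Name", "Mobile", "Address" };
        // Words that end the current value but are not extracted themselves.
        private static readonly HashSet<string> _ignoredKeys = new() { "Bill" };

        public static void Main(string[] args)
        {
            var line = "logo Name raj Mobile 9038874774 Address 6-98 india Bill auto generated";
            // Drop empty tokens so repeated spaces do not end up inside values.
            var splitLine = line.Split(' ', StringSplitOptions.RemoveEmptyEntries);

            var pairs = new Dictionary<string, string>();

            for (var i = 0; i < splitLine.Length; i++)
            {
                // Skip every word until we reach one of the keys to extract.
                var candidateKey = splitLine[i];
                if (!_extractKeys.Contains(candidateKey))
                {
                    continue;
                }

                // Collect the words after the key until the next known key.
                var valueParts = new List<string>();
                for (var v = i + 1; v < splitLine.Length; v++)
                {
                    var candidateValuePart = splitLine[v];
                    if (_ignoredKeys.Contains(candidateValuePart) || _extractKeys.Contains(candidateValuePart))
                    {
                        // Step back one so the outer loop's i++ lands on this key.
                        i = v - 1;
                        break;
                    }

                    valueParts.Add(candidateValuePart);
                }

                // Note: Add throws if the same key appears twice in the line.
                pairs.Add(candidateKey, string.Join(" ", valueParts));
            }

            foreach (var kv in pairs)
            {
                Console.WriteLine("{0}: {1}", kv.Key, kv.Value);
            }
        }
    }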
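
One possible way to deal with the case issue is to give the key sets a case-insensitive comparer, so the original lowercase line from the question can be used unchanged. The following is only a sketch under that assumption: KeyValueExtractor and Extract are names made up for this illustration, and the key lists are the same guesses as in the example above. It does not solve the ambiguity problem (a name that is literally Bill would still be cut off).

```csharp
using System;
using System.Collections.Generic;

public static class KeyValueExtractor
{
    // Assumed key lists; OrdinalIgnoreCase lets "name", "Name" and "NAME" all match.
    private static readonly HashSet<string> ExtractKeys =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Name", "Mobile", "Address" };
    private static readonly HashSet<string> IgnoredKeys =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "Bill" };

    // Hypothetical helper: returns the extracted pairs for one line of OCR text.
    public static Dictionary<string, string> Extract(string line)
    {
        var pairs = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        var tokens = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        var currentKey = string.Empty;        // empty means "not collecting a value"
        var valueParts = new List<string>();

        foreach (var token in tokens)
        {
            bool isKey = ExtractKeys.Contains(token);
            if (isKey || IgnoredKeys.Contains(token))
            {
                // A boundary word ends the value collected so far.
                if (currentKey.Length > 0)
                {
                    pairs[currentKey] = string.Join(" ", valueParts);
                }
                valueParts.Clear();
                currentKey = isKey ? token : string.Empty;
            }
            else if (currentKey.Length > 0)
            {
                valueParts.Add(token);
            }
            // Words seen before any key (such as "logo") are skipped.
        }

        // Flush the last pair when the line ends without a boundary word.
        if (currentKey.Length > 0)
        {
            pairs[currentKey] = string.Join(" ", valueParts);
        }
        return pairs;
    }
}

public static class Program
{
    public static void Main()
    {
        var line = "logo Name raj mobile 9038874774 address 6-98 india bill auto generated";
        foreach (var kv in KeyValueExtractor.Extract(line))
        {
            Console.WriteLine("{0}: {1}", kv.Key, kv.Value);
        }
    }
}
```

Because the returned dictionary also uses the case-insensitive comparer, a lookup such as result["Mobile"] finds the value even though the input spelled the key as mobile. Using the indexer instead of Add means a repeated key overwrites the earlier value rather than throwing; whether that is the right choice depends on the data.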