
How can I split data from a string with Unicode?


Good morning, I have a question. I need to recover data from a string that contains Unicode control characters, for example:

"\u001f\u0001\u0013FERREIRA RAMOS MUZI\u001f\u0002\0\u001f\u0003\aRICARDO\u001f\u0004\u0003URY\u001f\u0005\b09031979\u001f\u0006\u000eMONTEVIDEO/URY\u001f\a\b34946682\u001f\b\u0004\"\a \u0016\u001f\t\b22072026\u001f\n\0"

The same string as bytes (hex):

1F011346455252454952412052414D4F53204D555A491F02001F03075249434152444F1F04035552591F050830393033313937391F060E4D4F4E5445564944454F2F5552591F070833343934363638321F0804220720161F090832323037323032361F0A00

I need to recover the name, last name, etc. into an ArrayList or a string array, for example:

string[] array = { "Stephen", "King", "11301958", "NewYork/Usa" };

My problem is that if I use

System.Text.Encoding.UTF8.GetString(ByteArray);

to get the data, I only get the name and last name, but not the dates or the place of origin.

How can I get all of that from this string?
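One way to check what the decoded string really contains is to print it with every control character escaped. The sketch below uses a hypothetical helper, `EscapeControlChars`, which is not part of the original post, and feeds it the bytes `1F02001F03075249434152444F` taken from the dump above. A possible explanation for the missing values (an assumption, not something confirmed here) is that the embedded NUL character `\0` cuts the text short in whatever control or debugger view displays it, while the string itself still holds all the data.

```csharp
using System;
using System.Text;

static class DecodedStringInspector
{
    // Hypothetical helper: replaces each control character with a \uXXXX escape
    // so that nothing is hidden or truncated when the string is displayed.
    public static string EscapeControlChars(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (char.IsControl(c))
                sb.Append("\\u").Append(((int)c).ToString("x4"));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Bytes 1F 02 00 1F 03 07 "RICARDO", copied from the hex dump in the question.
        byte[] bytes = { 0x1F, 0x02, 0x00, 0x1F, 0x03, 0x07,
                         0x52, 0x49, 0x43, 0x41, 0x52, 0x44, 0x4F };
        string decoded = Encoding.UTF8.GetString(bytes);

        // Prints: \u001f\u0002\u0000\u001f\u0003\u0007RICARDO
        Console.WriteLine(EscapeControlChars(decoded));
    }
}
```

If the escaped output shows the dates and the place, the decoding step is fine and only the splitting remains to be solved.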


Solution

  • My solution:

    Detect only the letters a-z, A-Z and the digits with a regular expression. When the regular expression fails, or the character is a white space, the current word is complete and is added to a list. At the end I have a list with all the words and numbers I need.

    1- Convert Byte[] Data to string

    // Convert UTF-8 bytes to a string.
    string s_unicode2 = System.Text.Encoding.UTF8.GetString(apduRsp.Data);
    
    List<string> test = new List<string>();
    if (s_unicode2.Length > 0)
    {
       test = GetWords(s_unicode2);
    }
    

    2- Call GetWords() with string converted from Byte[]

    // Requires: using System.Collections.Generic; using System.Text;
    //           using System.Text.RegularExpressions;
    private List<string> GetWords(string text)
    {
        Regex reg = new Regex("[a-zA-Z0-9]");
        StringBuilder word = new StringBuilder();
        List<string> words = new List<string>();
        foreach (char c in text)
        {
            // Keep ASCII letters, digits and '/' (as in "MONTEVIDEO/URY").
            if (reg.IsMatch(c.ToString()) || c == '/')
            {
                word.Append(c);
            }
            else if (word.Length > 0)
            {
                // A space, a control character or any other character
                // ends the current word.
                words.Add(word.ToString());
                word.Clear();
            }
        }
        // Flush the last word when the text does not end with a separator.
        if (word.Length > 0)
        {
            words.Add(word.ToString());
        }
        return words;
    }
    

    3- Result from GetWords()

    List<string> values returned

    This solution is good enough for me at the moment, but some people have two names, and that causes a small problem when displaying the result.
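The two-name problem comes from treating the space as a separator. Reading the hex dump in the question, every field appears to follow the same pattern: the byte `0x1F`, a field number, a length byte, and then that many value bytes (for example `1F 03 07` followed by the 7 bytes of `RICARDO`). That layout is inferred from this one sample only and is not documented in the post, so the following is a minimal sketch under that assumption. `FieldParser`, `ParseFields` and `HexToBytes` are hypothetical names. Values are kept as `byte[]` because field 8 in the sample (`22 07 20 16`) does not look like text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class FieldParser
{
    // Assumed layout (inferred from the sample): 0x1F, field number, length, value bytes.
    public static Dictionary<int, byte[]> ParseFields(byte[] data)
    {
        var fields = new Dictionary<int, byte[]>();
        int pos = 0;
        while (pos + 3 <= data.Length)
        {
            if (data[pos] != 0x1F)
                throw new FormatException("Expected 0x1F at offset " + pos);
            int fieldNumber = data[pos + 1];
            int length = data[pos + 2];
            pos += 3;
            if (pos + length > data.Length)
                throw new FormatException("Field " + fieldNumber + " is truncated");

            // Copy exactly 'length' bytes; spaces inside the value are preserved.
            byte[] value = new byte[length];
            Array.Copy(data, pos, value, 0, length);
            fields[fieldNumber] = value;
            pos += length;
        }
        return fields;
    }

    // Converts a hex string such as "1F0200" into bytes.
    public static byte[] HexToBytes(string hex)
    {
        var bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(2 * i, 2), 16);
        return bytes;
    }

    static void Main()
    {
        // First four fields of the hex dump from the question.
        string hex = "1F011346455252454952412052414D4F53204D555A49"
                   + "1F02001F03075249434152444F1F0403555259";
        Dictionary<int, byte[]> fields = ParseFields(HexToBytes(hex));

        Console.WriteLine(Encoding.ASCII.GetString(fields[1])); // FERREIRA RAMOS MUZI
        Console.WriteLine(fields[2].Length);                    // 0
        Console.WriteLine(Encoding.ASCII.GetString(fields[3])); // RICARDO
        Console.WriteLine(Encoding.ASCII.GetString(fields[4])); // URY
    }
}
```

Because each value is read by its length byte rather than split on spaces, a multi-word value such as `FERREIRA RAMOS MUZI` stays in a single entry, and empty fields (length 0) are still reported under their field number.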