Have to add magic number to length in .Substring


I have a data structure that I read over UDP, and it looks like this:

$T_DATA,17,<?xml version="1.0"?>
<TData id="Channel 4">
  <Meta>
    <InstrumentID>17</InstrumentID>
    <DatagramID>4</DatagramID>
    <Timestamp>2024-09-10 15:00:57.480</Timestamp>
  </Meta>
  <Data>
    <Value ID="1" type="0">108.33</Value>
    <Value ID="2" type="0">-39</Value>
    <Value ID="3" type="0">422.9</Value>
  </Data>
</TData>
????

I want to remove everything before <TData and everything after </TData>.

My code that does exactly that looks like this:

static string StripNonXmlContent(string input)
{
    // Find the start of the XML tag and ignore anything before it
    string stopString = "</TData>";
    int xmlStartIndex = input.IndexOf("<TData");
    // TODO: Explain magic number 7. Without it return string gets too short.
    int xmlStopIndex = input.LastIndexOf(stopString) + stopString.Length + 7;

    // If both the start and end of the XML are found
    if (xmlStartIndex >= 0 && xmlStopIndex >= 0)
    {
        // Extract everything between <TData> and </TData>
        return input.Substring(xmlStartIndex, (xmlStopIndex - xmlStartIndex));
    }

    // If no valid XML structure is found, return the input as is
    return input;
}

At first my stop index was xmlStopIndex = input.LastIndexOf(stopString) + stopString.Length;, but that cut the returned string too short: the last line became just <T. Through trial and error I found that adding the magic number 7 makes it work. Can somebody explain why?


Further investigation. The string actually looks like this:

???$?T?C?L?I?E?N?T?_?D?A?T?A?,?1?7?,?<???x?m?l? ?v?e?r?s?i?o?n?=?"?1?.?0?"???>?
?<?T?D?a?t?a? ?i?d?=?"?C?h?a?n?n?e?l? ?5?"?>?
? ? ? ? ?<?M?e?t?a?>?
? ? ? ? ? ? ? ? ?<?I?n?s?t?r?u?m?e?n?t?I?D?>?1?7?<?/?I?n?s?t?r?u?m?e?n?t?I?D?>?
? ? ? ? ? ? ? ? ?<?D?a?t?a?g?r?a?m?I?D?>?5?<?/?D?a?t?a?g?r?a?m?I?D?>?
? ? ? ? ? ? ? ? ?<?T?i?m?e?s?t?a?m?p?>?2?0?2?4?-?0?9?-?1?0? ?1?5?:?3?4?:?5?6?.?4?8?6?<?/?T?i?m?e?s?t?a?m?p?>?
? ? ? ? ?<?/?M?e?t?a?>?
? ? ? ? ?<?D?a?t?a?>?
? ? ? ? ? ? ? ? ?<?V?a?l?u?e? ?I?D?=?"?1?"? ?t?y?p?e?=?"?0?"?>?1?0?5?.?7?4?<?/?V?a?l?u?e?>?
? ? ? ? ? ? ? ? ?<?V?a?l?u?e? ?I?D?=?"?2?"? ?t?y?p?e?=?"?0?"?>?-?3?9?<?/?V?a?l?u?e?>?
? ? ? ? ? ? ? ? ?<?V?a?l?u?e? ?I?D?=?"?3?"? ?t?y?p?e?=?"?0?"?>?3?3?5?.?4?<?/?V?a?l?u?e?>?
? ? ? ? ?<?/?D?a?t?a?>?
?<?/?T?D?a?t?a?>?
????

There are seven ? characters interleaved between the eight characters of </TData>, which explains why I have to add 7 to make this work.
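
The arithmetic can be checked with a small sketch. It assumes the payload is UTF-16 and reproduces the ASCII misread from the question; the class and method names (MagicNumberCheck, SpanAfterAsciiDecode) are hypothetical and not part of the original code.

```csharp
using System;
using System.Text;

static class MagicNumberCheck
{
    // Hypothetical helper: encodes `tag` as UTF-16, decodes those bytes as ASCII
    // (the mistake in the question), and returns how many chars the tag then spans.
    public static int SpanAfterAsciiDecode(string tag, Encoding utf16)
    {
        byte[] bytes = utf16.GetBytes(tag);                // two bytes per character
        string misread = Encoding.ASCII.GetString(bytes);  // one char per byte
        int first = misread.IndexOf(tag[0]);               // char overload: ordinal search
        int last = misread.LastIndexOf(tag[tag.Length - 1]);
        return last - first + 1;
    }
}
```

For "</TData>" (8 characters) this returns 15 with either Encoding.Unicode or Encoding.BigEndianUnicode, that is 8 + 7, which matches the magic number.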

There must be something wrong with how I read the data. It looks like this:

byte[] receiveBytes = udpClient.Receive(ref remoteEndPoint);
string receivedData = Encoding.ASCII.GetString(receiveBytes, 0, receiveBytes.Length);
Console.WriteLine($"Received data from {remoteEndPoint}:");
Console.WriteLine(receivedData);

Solution

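  • It looks like your data is arriving as UTF-16, but you are decoding it as ASCII. UTF-16 stores each of these characters in two bytes, so decoding byte by byte yields one extra character per real character, which is where the ? characters in your dump come from. Try decoding the data as UTF-16 instead (Encoding.Unicode or Encoding.BigEndianUnicode, depending on the byte order).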
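
A minimal sketch of that fix, under stated assumptions: the dump in the question does not settle the byte order, so the helper checks for a byte-order mark and otherwise falls back to little-endian, which is a guess. The names UdpXmlDecoding and DecodeUtf16 are hypothetical. StripNonXmlContent is the function from the question, rewritten without the magic number and with ordinal comparisons.

```csharp
using System;
using System.Text;

static class UdpXmlDecoding
{
    // Hypothetical helper: picks a UTF-16 decoder from the byte-order mark, if present.
    // Falling back to little-endian when there is no BOM is an assumption.
    public static string DecodeUtf16(byte[] data)
    {
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return Encoding.BigEndianUnicode.GetString(data, 2, data.Length - 2);
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return Encoding.Unicode.GetString(data, 2, data.Length - 2);
        return Encoding.Unicode.GetString(data);
    }

    // Same idea as StripNonXmlContent in the question, without the magic number.
    public static string StripNonXmlContent(string input)
    {
        const string stopString = "</TData>";
        int start = input.IndexOf("<TData", StringComparison.Ordinal);
        int stop = input.LastIndexOf(stopString, StringComparison.Ordinal);

        // Return the input unchanged unless both markers are found, in order.
        if (start < 0 || stop < start)
            return input;

        return input.Substring(start, stop + stopString.Length - start);
    }
}
```

In the receive code, the Encoding.ASCII.GetString call would become string receivedData = UdpXmlDecoding.DecodeUtf16(receiveBytes); and the result can be passed to StripNonXmlContent as before.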