Tags: c#, string, encoding, tcp, network-programming

C# TCP sending string with special char removes last char


When I send a string containing special chars, one char is dropped from the end of the string for each special char it contains. For example: åker -> åke, ååker -> ååk

Sending strings with no special chars returns the full string as expected.

My string packets are written and read like so:

public void Write(string _value)
{
    Write(_value.Length); // Add the length of the string to the packet
   
    //buffer.AddRange(Encoding.ASCII.GetBytes(_value)); // Add the string itself
    buffer.AddRange(Encoding.UTF8.GetBytes(_value)); // Add the string itself
}

public string ReadString(bool _moveReadPos = true)
{
    try
    {
        int _length = ReadInt(); // Get the length of the string
        //string _value = Encoding.ASCII.GetString(readableBuffer, readPos, _length); // Convert the bytes to a string
        string _value = Encoding.UTF8.GetString(readableBuffer, readPos, _length); // Convert the bytes to a string
        Debug.Log("Value: " + _value);
        if (_moveReadPos && _value.Length > 0)
        {
            // If _moveReadPos is true and the string is not empty
            readPos += _length; // Increase readPos by the length of the string
        }
        return _value; // Return the string
    }
    catch
    {
        throw new Exception("Could not read value of type 'string'!");
    }
}

As you can see, I've tried both ASCII and UTF8. With ASCII, special chars are just replaced with '?' and the string is not cut off.

What can I do to fix this issue? Let me know if more code is needed.


Solution

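  • You're writing the length of the string (its number of chars), not the length of the encoded bytes. With multibyte characters, these will be different: in UTF-8, å takes two bytes, so åker is 4 chars but 5 bytes, and a reader told to read 4 bytes stops one byte short. ASCII encodes every char as exactly one byte, which is why that version was not cut off. Write the byte count instead: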

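    public void Write(string _value)
    {
        var bytes = Encoding.UTF8.GetBytes(_value); // Encode once so the length and payload agree
        Write(bytes.Length); // Add the number of encoded bytes to the packet
        buffer.AddRange(bytes); // Add the encoded string itself
    }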
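
To see the mismatch and the fix in isolation, here is a minimal standalone sketch. It does not use the asker's packet class: `LengthPrefixDemo`, `WriteString` and the `ref int readPos` parameter are hypothetical stand-ins, and the 4-byte prefix written with `BitConverter` is an assumption about how `Write(int)` and `ReadInt()` behave.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for the packet class; only the string path is modeled.
public static class LengthPrefixDemo
{
    // Writes a 4-byte prefix holding the UTF-8 byte count, then the encoded bytes.
    public static byte[] WriteString(string value)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(value);
        var buffer = new List<byte>();
        buffer.AddRange(BitConverter.GetBytes(bytes.Length)); // byte count, not value.Length
        buffer.AddRange(bytes);
        return buffer.ToArray();
    }

    // Reads the byte-count prefix, then decodes exactly that many bytes.
    public static string ReadString(byte[] data, ref int readPos)
    {
        int byteLength = BitConverter.ToInt32(data, readPos);
        readPos += 4;
        string value = Encoding.UTF8.GetString(data, readPos, byteLength);
        readPos += byteLength; // advance by bytes consumed, not by chars decoded
        return value;
    }

    public static void Main()
    {
        string text = "\u00E5ker"; // "åker", with å as the single code point U+00E5

        Console.WriteLine(text.Length);                      // 4 chars
        Console.WriteLine(Encoding.UTF8.GetByteCount(text)); // 5 bytes

        byte[] packet = WriteString(text);
        int pos = 0;
        string roundTripped = ReadString(packet, ref pos);

        Console.WriteLine(roundTripped == text); // True
        Console.WriteLine(pos == packet.Length); // True
    }
}
```

With this change the ReadString from the question should not need to be touched: it already passes _length to Encoding.UTF8.GetString as a byte count and advances readPos by _length, so it becomes correct once the sender writes the byte count.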