How to handle a change in the interpretation of a field in a protobuf message?


If a field stores a specific value and is interpreted in a specific manner, is it possible to change this interpretation in a backwards compatible way?

Let's say I have a field that stores values of different data types. The most generic case is to store it as a byte array and let the apps encode and decode it to the correct data type. Common cases for data types are integers and strings, so support for those types is present. Using a oneof structure this looks as follows:

message Foo
{
    ...
    oneof value
    {
        uint32 integer = 1;
        string text    = 2;
        bytes  data    = 3;
    }
}

Applications that want to store an ip prefix in the value field have to use the generic data field and handle the encoding and decoding themselves.

If I now want to add support for ip prefixes to the Foo message itself so the apps don't have to deal with the encoding and decoding anymore, I could add a new field to the oneof structure with an IpPrefix datatype:

message Foo
{
    ...
    oneof value
    {
        uint32 integer = 1;
        string text    = 2;
        bytes  data    = 3;
        IpPrefix ip_prefix = 4;
    }
}

Even though this makes life easier for the apps, I believe it breaks backwards compatibility. If a sending app supports the new field, it will put its ip prefix value in the ip_prefix field. But a receiving app that does not support the new field yet will ignore it. It will look for the ip prefix value in the data field, as it always did, and won't find it there. So the receiving app can no longer read the ip prefix value correctly.
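
To make the failure concrete, here is a rough Python sketch. It does not use the real protobuf library: a serialized Foo is modelled as a plain dict from field number to value, `parse` is a hypothetical helper, and the `(address, length)` tuple is only an assumed shape for IpPrefix.

```python
# Hypothetical stand-in for protobuf parsing: a serialized Foo is modelled as
# a dict mapping field numbers to values. No real protobuf API is used here.

OLD_SCHEMA = {1: "integer", 2: "text", 3: "data"}
NEW_SCHEMA = {**OLD_SCHEMA, 4: "ip_prefix"}


def parse(wire, schema):
    """Split a wire message into fields the schema knows and fields it does not."""
    known = {schema[num]: val for num, val in wire.items() if num in schema}
    unknown = {num: val for num, val in wire.items() if num not in schema}
    return known, unknown


# A new writer puts the prefix in field 4 (ip_prefix) instead of field 3 (data).
# The (address, length) tuple is an assumed representation of IpPrefix.
wire_from_new_writer = {4: ("192.0.2.0", 24)}

old_known, old_unknown = parse(wire_from_new_writer, OLD_SCHEMA)
new_known, new_unknown = parse(wire_from_new_writer, NEW_SCHEMA)

print(old_known)    # what a reader built against the old schema can see
print(old_unknown)  # what it received but cannot interpret
print(new_known)    # what a reader built against the new schema can see
```

The old reader ends up with nothing in data, which is exactly the case I am worried about.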

Is there a way to make this scenario somehow backwards compatible?

PS: I realize this is a somewhat vague and perhaps unrealistic example. The exact case I need it for is the representation of RADIUS attributes in a protobuf message. These attributes are in essence a byte array that is sent over the network, but the bytes in the array have meaning and could be stored as different fields in the protobuf message. A basic attribute consists of a Type field and a Value field, where the value can be a string, integer, ip address... From time to time new datatypes (even complex ones) are added, and I would like to be able to add new datatypes in a backwards compatible way.


Solution

  • There are two ways to go about this:

    1. Enforce an update schedule, readers before writers

    Add the new type of field to the .proto definition, but document that it should not be used except for testing and reception. Document that all readers of the message must support both the old and the new field by a specific milestone/date, after which the writers can start using it. Eventually you can deprecate the old field and new readers don't need to support it anymore.

    2. Have both fields during the transition period

    message Foo
    {
        ...
        oneof value
        {
            uint32 integer = 1;
            string text    = 2;
            bytes  data    = 3;
        }
    
        IpPrefix ip_prefix = 4;
    }
    

    Document that writers should set both data and ip_prefix during the transition period. The readers can start using ip_prefix as soon as writers have added support, and can optionally fall back to data.

    Later, once all readers use ip_prefix and writers no longer need to set data alongside it, you can deprecate data and move ip_prefix inside the oneof without breaking compatibility.
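
The transition-period behaviour from option 2 can be sketched in Python roughly as follows. The same reader logic also fits option 1, where readers must accept both fields before writers switch. This is only an illustration under assumptions: `Foo` and `IpPrefix` are dataclass stand-ins for the generated classes, the `address` and `length` fields are hypothetical (the question does not define IpPrefix), and the legacy byte layout (one length byte followed by four IPv4 address bytes) is invented for the example.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class IpPrefix:
    # Hypothetical fields; the original post does not define IpPrefix.
    address: str
    length: int


@dataclass
class Foo:
    # Stand-in for the generated message; only the fields relevant here.
    data: Optional[bytes] = None          # legacy field 3 (inside the oneof)
    ip_prefix: Optional[IpPrefix] = None  # new field 4 (outside the oneof)


def encode_legacy(prefix: IpPrefix) -> bytes:
    """Assumed legacy encoding: 1 length byte followed by 4 IPv4 address bytes."""
    return bytes([prefix.length]) + ipaddress.IPv4Address(prefix.address).packed


def decode_legacy(raw: bytes) -> IpPrefix:
    """Inverse of encode_legacy."""
    return IpPrefix(address=str(ipaddress.IPv4Address(raw[1:5])), length=raw[0])


def write_foo(prefix: IpPrefix) -> Foo:
    """Transition-period writer: set both representations."""
    return Foo(data=encode_legacy(prefix), ip_prefix=prefix)


def read_prefix(foo: Foo) -> Optional[IpPrefix]:
    """Transition-period reader: prefer ip_prefix, fall back to data."""
    if foo.ip_prefix is not None:
        return foo.ip_prefix
    if foo.data is not None:
        return decode_legacy(foo.data)
    return None


prefix = IpPrefix("192.0.2.0", 24)
from_new_writer = write_foo(prefix)                # sets data and ip_prefix
from_old_writer = Foo(data=encode_legacy(prefix))  # sets data only

print(read_prefix(from_new_writer))
print(read_prefix(from_old_writer))
```

Because the writer keeps filling data, a reader that has not been updated can still call its existing decoder on `from_new_writer.data`, while updated readers get the typed value directly.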