Search code examples
c++perlthrift

Apache Thrift: When use "optional' before a list, C++ server seems not to return it correctly


Hi! every one, I have a problem similar with this one:

regarding thrift function return list

When a list has a "optional" modifier, the thrift C++ server will always return it as null/undef.

Additionally, if a struct contained by a optional list, any of the struct's field, can't be set to "optional", or, a null/undef value will be returned.

After deleting all the "optional" modifier, everything is OK.

Could anybody tell me why I can't use "optional" before a list ?


Here is the thrift file:

namespace cpp thrifttest
namespace perl thrifttest

struct Pair {
    1: optional string a, 
    2: optional string b
}

struct Result {
    1: optional list<string> strList,
    2: optional list<Pair> pairList
}

service Test {

    Result listTest()

}

Here is the C++ code (server):

#include "gen-cpp/Test.h"
#include <thrift/protocol/TBinaryProtocol.h>
#include <thrift/server/TSimpleServer.h>
#include <thrift/transport/TServerSocket.h>
#include <thrift/transport/TBufferTransports.h>

#include <iostream>

using namespace std;

using namespace ::apache::thrift;
using namespace ::apache::thrift::protocol;
using namespace ::apache::thrift::transport;
using namespace ::apache::thrift::server;

using boost::shared_ptr;

using namespace  ::thrifttest;

class TestHandler : virtual public TestIf {
    public:
        TestHandler() {
            // Your initialization goes here
        }

        void listTest(Result& _return) {

            _return.strList.push_back("Test");
            _return.strList.push_back("one level list");
            cout << "strList size: " << _return.strList.size() << endl;

            Pair pair;
            pair.a = "Test";
            pair.b = "two level list";
            _return.pairList.push_back(pair);
            cout << "pairList size: " << _return.pairList.size() << endl;

            printf("call listTest\n");
        }

};

int main(int argc, char **argv) {
    int port = 9595;
    shared_ptr<TestHandler> handler(new TestHandler());
    shared_ptr<TProcessor> processor(new TestProcessor(handler));
    shared_ptr<TServerTransport> serverTransport(new TServerSocket(port));
    shared_ptr<TTransportFactory> transportFactory(new TBufferedTransportFactory());
    shared_ptr<TProtocolFactory> protocolFactory(new TBinaryProtocolFactory());

    TSimpleServer server(processor, serverTransport, transportFactory, protocolFactory);
    server.serve();
    return 0;
}

Here is the perl code (client):

#!/usr/bin/perl

use v5.12;
use warnings;
use autodie;

use utf8;
use Data::Dumper;

use lib 'gen-perl';

use thrifttest::Test;
use thrifttest::Constants;
use thrifttest::Types;

use Thrift;
use Thrift::BinaryProtocol;
use Thrift::Socket;
use Thrift::BufferedTransport;


my $socket    = new Thrift::Socket('localhost', 9595);
my $transport = new Thrift::BufferedTransport($socket, 1024, 1024);
my $protocol  = new Thrift::BinaryProtocol($transport);
my $client    = new thrifttest::TestClient($protocol);


eval {
    $transport->open();
    my $result = $client->listTest;
    say Dumper($result);
    $transport->close();
};
say $@ if $@;

C++ server output:

strList size: 2
pairList size: 1
call listTest

perl client output:

$VAR1 = bless( {
                 'pairList' => undef,
                 'strList' => undef
               }, 'thrifttest::Result' );

PS: My development environment is CentOS 7, GCC 4.8.3, Perl 5.16, thrift 0.9.3


Solution

  • I was wrong, it is not a real bug, it is just some unfortunate design that is not really fool proof and allows the user to make Dumb Things™. The problem lies in the semantics of the optional operator and how it is implemented for C++.

    Lets asssume we have this piece of IDL:

    struct Xtruct2
    {
      1: i8     byte_thing,  
      2: Xtruct struct_thing,
      3: i32    i32_thing
    }
    

    The generated code, in particular the write() method, looks like this:

    uint32_t Xtruct2::write(::apache::thrift::protocol::TProtocol* oprot) const {
      //...
      xfer += oprot->writeFieldBegin("struct_thing", ::apache::thrift::protocol::T_STRUCT, 2);
      xfer += this->struct_thing.write(oprot);
      xfer += oprot->writeFieldEnd();
      //...
    }
    

    If we now modify the IDL and add the optional specifier:

    struct Xtruct2
    {
      1: i8     byte_thing,  
      2: optional Xtruct struct_thing,
      3: i32    i32_thing
    }
    

    The generated code looks slightly different:

    uint32_t Xtruct2::write(::apache::thrift::protocol::TProtocol* oprot) const {
      //...
      if (this->__isset.struct_thing) {
        xfer += oprot->writeFieldBegin("struct_thing", ::apache::thrift::protocol::T_STRUCT, 2);
        xfer += this->struct_thing.write(oprot);
        xfer += oprot->writeFieldEnd();
      }
      //...
    }
    

    Thrift has three kinds of requiredness: required, optional and default. The latter is assumed implicitly if neither required nor optional was specified (that's why there is no special keyword for default requiredness). The semantics vis-à-vis reading and writing those fields is as follows:

        requiredness        write field?        read field?         
        ----------------------------------------------------------------------
        required            always              always, must be present
        (default)           always              if present, may be missing
        optional            only if set         if present, may be missing
    

    So what optional changes compared to the default is the behaviour of the write method. While a default field is always written, an optional field is only written conditionally. That is checked by means of the __isset flags, which consist of a bitset, one bit for each field that is not required. If the corresponding bit flag within __isset is set, the field value can be used. If the bit flag is not present, the field value has not been initialized and thus should not be used.

    So far this would not be a much of a problem. But there's a trap that you managed to hit: Default and optional fields can be accessed and used in C++ even if the bit flag is not set because they are just there. In case of default requiredness, this is not a big deal: You assign your values, and since the field is always written, basically nothing bad happens. The bit flag for the field is set when the field is deserialized (see generated code).

    However, things change when you opt-in to optional: Now all of a sudden you are responsible to set the flag properly, either by accessing __isset directly or by means of the generated setter method, in our case this one:

    void Xtruct2::__set_struct_thing(const Xtruct& val) {
      this->struct_thing = val;
    __isset.struct_thing = true;
    }
    

    My assumption was that it should not be possible to access an optional field that has not been set, but it turned out that this seems to be by design. I still think that the design is a bit too much error prone here.