My Thrift definition is something like this:
list<i32> getValues()
I implemented it in C++.
Server.cpp has the following piece of code:
.....
std::vector<int32_t> store;

TransferServiceHandler() {
    // Pre-fill the store once, at server startup
    for (int i = 0; i < 100000000; i++)
        store.push_back(i);
}

void getValues(std::vector<int32_t>& _return) {
    // Copy the whole store into the Thrift output vector
    _return = store;
}
.....
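For reference, the handler is plugged into the standard skeleton that the Thrift compiler generates. It looks roughly like the sketch below; the service name TransferService, TSimpleServer, the buffered transport, the binary protocol, and port 9090 are placeholders/assumptions rather than a statement of my exact setup:

#include "TransferService.h"  // generated header (name assumed)
#include <thrift/protocol/TBinaryProtocol.h>
#include <thrift/server/TSimpleServer.h>
#include <thrift/transport/TServerSocket.h>
#include <thrift/transport/TBufferTransports.h>

using namespace ::apache::thrift;
using namespace ::apache::thrift::protocol;
using namespace ::apache::thrift::transport;
using namespace ::apache::thrift::server;

int main() {
    // boost::shared_ptr in older Thrift releases; newer ones use std::shared_ptr
    boost::shared_ptr<TransferServiceHandler> handler(new TransferServiceHandler());
    boost::shared_ptr<TProcessor> processor(new TransferServiceProcessor(handler));
    boost::shared_ptr<TServerTransport> serverTransport(new TServerSocket(9090));
    boost::shared_ptr<TTransportFactory> transportFactory(new TBufferedTransportFactory());
    boost::shared_ptr<TProtocolFactory> protocolFactory(new TBinaryProtocolFactory());

    // Single-threaded server: handles one connection at a time
    TSimpleServer server(processor, serverTransport, transportFactory, protocolFactory);
    server.serve();
    return 0;
}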
Client.cpp has a simple loop in which it calls getValues():
struct timespec ds_spec, de_spec;  // start/end timestamps
for (int k = 0; k < 10; k++) {
    clock_gettime(CLOCK_REALTIME, &ds_spec);
    int64_t dstarted = ds_spec.tv_sec * 1000 + (ds_spec.tv_nsec / 1.0e6);

    std::vector<int32_t> values;
    client.getValues(values);

    clock_gettime(CLOCK_REALTIME, &de_spec);
    int64_t dended = de_spec.tv_sec * 1000 + (de_spec.tv_nsec / 1.0e6);

    std::cout << "Values size: " << values.size() << " in " << (dended - dstarted) << " ms\n";
}
Connections are initialized and closed outside the loop.
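The setup outside the loop looks roughly like the following sketch; the client class name TransferServiceClient, the host, the port, and the buffered transport/binary protocol pairing are illustrative assumptions:

#include "TransferService.h"  // generated header (name assumed)
#include <thrift/protocol/TBinaryProtocol.h>
#include <thrift/transport/TSocket.h>
#include <thrift/transport/TBufferTransports.h>

using namespace ::apache::thrift;
using namespace ::apache::thrift::protocol;
using namespace ::apache::thrift::transport;

// Opened once before the timing loop and closed once after it
boost::shared_ptr<TTransport> socket(new TSocket("localhost", 9090));
boost::shared_ptr<TTransport> transport(new TBufferedTransport(socket));
boost::shared_ptr<TProtocol> protocol(new TBinaryProtocol(transport));
TransferServiceClient client(protocol);

transport->open();
// ... the timing loop above runs here ...
transport->close();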
Usually a few hundred thousand entries are returned by this call.
When there is no data (the lists are empty), the call completes in 1-2 ms; as soon as I vary the amount of data, there is an unpredictable delay in the transfer. Both the client and the server run on the same machine (equipped with 10 Gb/s Ethernet, 8 cores and 30 GB of memory).
How do you normally debug a situation like this? I don't think the issue is the network, since it's a 10 Gb/s machine and the data is hardly a few MB.
I ran a benchmark with various data sizes, and you can see the delay isn't stable from call to call.
I saw a significant performance improvement after transferring the data as binary rather than as a list.
In the Thrift definition file, I changed the list to binary.
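In the generated C++, Thrift's binary type maps to std::string, so the handler now just copies the raw bytes of the vector. Roughly (a sketch of the idea; the exact copy code is illustrative):

// .thrift file: getValues() now returns binary instead of list<i32>
void getValues(std::string& _return) {
    // Copy the raw bytes of the int32_t vector into the returned string
    _return.assign(reinterpret_cast<const char*>(store.data()),
                   store.size() * sizeof(int32_t));
}

The client then has to reinterpret the returned bytes as int32_t values itself (and take care of byte order, which the list<i32> encoding previously handled).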
Here's the new benchmark on the same amount of data: