java, redis, jedis

How many commands does a Jedis pipeline execute at a time by default?


I am using a Jedis pipeline to insert a batch of data into Redis, and I have run into a confusing problem. I want to batch a specific number of commands and then call sync(), but the pipeline seems to sync automatically roughly every 200 records. Here is my code. Can anyone tell me whether there is a configuration setting for this?

import java.io.IOException;

import redis.clients.jedis.Jedis;
import redis.clients.jedis.Pipeline;

public class RedisClusterTest {
    public static void main(String[] args) throws IOException, InterruptedException {
        String host = args[0];
        int port = Integer.parseInt(args[1]);
        int cnt = Integer.parseInt(args[2]); // read from the arguments but not used in this snippet
        Jedis jedis = new Jedis(host, port);

        Pipeline pip = jedis.pipelined();
        for (int i = 0; i < 2000; i++) {
            pip.hset("Server", String.valueOf(i), String.valueOf(i));
            Thread.sleep(10);
        }
        // By the time the loop ends, about 1900 records have already been inserted into Redis;
        // the final sync() only sends the remaining data.
        pip.sync();
    }
}

Solution

  • A pipeline does not wait for your confirmation before sending commands to Redis. The documentation says:

    Sometimes you need to send a bunch of different commands. A very cool way to do that, and have better performance than doing it the naive way, is to use pipelining. This way you send commands without waiting for response, and you actually read the responses at the end, which is faster.

    In short, a pipeline sends commands without waiting for their responses; the commands are simply written out like a stream.

    I took a look at the source code, which confirms the documentation.

    public Pipeline pipelined() {
      pipeline = new Pipeline();
      pipeline.setClient(client);
      return pipeline;
    }
    

    This returns your Pipeline instance. Then you call a bunch of HSETs:

    public Long hset(final byte[] key, final byte[] field, final byte[] value) {
      checkIsInMultiOrPipeline();
      client.hset(key, field, value);
      return client.getIntegerReply();
    }
    

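    This hands the command to the client's output stream as soon as you call it, without waiting for sync. (The snippet above is the non-pipelined Jedis.hset, which also reads its reply right away; as far as I can tell, the Pipeline variant issues the same client.hset(...) call but returns a Response<Long> that is filled in later.)

    Then you call sync, whose documentation says: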

    Synchronize pipeline by reading all responses. This operation close the pipeline. In order to get return values from pipelined commands, capture the different Response<?> of the commands you execute.

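    In other words, sync() sends whatever has not been sent yet and reads back the responses to all the commands in the pipeline. It is not the point at which the batch starts being sent.

    To sum up, do not use pipelined() if you expect it to send a batch only when you call sync(). That is not how it works.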
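
One plausible explanation for the batches of roughly 200 records is output buffering: commands are written into a fixed-size buffer on the client side and leave for the server whenever that buffer fills up, not only when sync() is called. The sketch below models this with plain JDK classes rather than Jedis itself. It assumes an 8192-byte buffer (which I believe is the Jedis default, but check your version), uses a ByteArrayOutputStream as a stand-in for the socket, and the names BufferFlushDemo, encodeHset and commandsUntilFirstAutoFlush are made up for illustration.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BufferFlushDemo {

    // Encodes "HSET key field value" in the Redis wire protocol (RESP):
    // an array header followed by four length-prefixed bulk strings.
    static byte[] encodeHset(String key, String field, String value) {
        StringBuilder sb = new StringBuilder("*4\r\n");
        for (String part : new String[] {"HSET", key, field, value}) {
            int len = part.getBytes(StandardCharsets.UTF_8).length;
            sb.append('$').append(len).append("\r\n").append(part).append("\r\n");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    // Writes the HSET commands from the question into a buffered stream and
    // returns how many write calls it took until bytes first reached the
    // underlying "socket" without any explicit flush().
    static int commandsUntilFirstAutoFlush(int bufferSize) {
        ByteArrayOutputStream socket = new ByteArrayOutputStream(); // stand-in for the real socket
        BufferedOutputStream out = new BufferedOutputStream(socket, bufferSize);
        int written = 0;
        try {
            while (socket.size() == 0) {
                String n = String.valueOf(written);
                out.write(encodeHset("Server", n, n));
                written++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        // Assumed buffer size; adjust to whatever your client actually uses.
        System.out.println(commandsUntilFirstAutoFlush(8192));
    }
}
```

With the 40 to 44 byte HSET commands from the question, the first automatic flush in this model happens after a bit under 200 commands, which is in line with what the question observes. If data must not reach Redis before a batch is complete, one option is to collect the values yourself (for example in a list or map) and only issue the pipeline commands, followed by sync(), once the batch is full.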