Redis mass insertion: protocol vs inline commands


For my task I need to load a large amount of data into Redis as fast as possible. This article seems to cover my case: https://redis.io/topics/mass-insert

The article starts by giving an example that uses multiple inline SET commands with redis-cli. It then moves on to generating Redis protocol and feeding that to redis-cli instead. It doesn't explain the reasons for, or benefits of, using the Redis protocol.

Using the Redis protocol is a bit harder, and it generates a bit more traffic. What are the reasons to use the Redis protocol rather than simple one-line commands? Is it perhaps that, even though the data is larger, it is easier (and faster) for Redis to parse?


Solution

  • Good point.

    The article itself gives the reason: "Only a small percentage of clients support non-blocking I/O, and not all the clients are able to parse the replies in an efficient way in order to maximize throughput. For all these reasons the preferred way to mass import data into Redis is to generate a text file containing the Redis protocol, in raw format, in order to call the commands needed to insert the required data."

    As I understand it, when you feed raw Redis protocol to redis-cli you are emulating a client yourself, so the import does not depend on the client-library limitations quoted above (blocking I/O and inefficient reply parsing).
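
    To make the two input forms concrete, here is a minimal sketch that encodes a single SET command both ways and compares the sizes. The helper name `resp_encode` is hypothetical (it is not part of redis-cli or any library); it only mirrors the array-of-bulk-strings layout that the article describes. Because every argument carries its byte length, a value containing spaces needs no quoting in protocol form, whereas an unquoted inline command is split on whitespace.

    ```ruby
    # Hypothetical helper: encode one command as a Redis protocol array
    # of length-prefixed bulk strings.
    def resp_encode(*args)
      args.reduce("*#{args.length}\r\n") do |out, arg|
        s = arg.to_s
        out + "$#{s.bytesize}\r\n#{s}\r\n"
      end
    end

    inline = "SET Key0 Value0\r\n"
    resp   = resp_encode("SET", "Key0", "Value0")

    puts resp.inspect
    puts "inline: #{inline.bytesize} bytes, protocol: #{resp.bytesize} bytes"
    # prints: inline: 17 bytes, protocol: 35 bytes

    # A value with a space is still a single argument in protocol form.
    spaced = resp_encode("SET", "k", "a b")
    puts spaced.inspect
    ```

    So the protocol form is roughly twice the size for short keys and values like these, which matches the "a bit more traffic" observation in the question; the relative overhead shrinks as values get longer.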

    Based on the docs you provided, I tried these scripts:

    test.rb

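    # Encode one command as a Redis protocol array of bulk strings.
    def gen_redis_proto(*cmd)
        proto = "*#{cmd.length}\r\n"            # array header: number of arguments
        cmd.each do |arg|
            arg = arg.to_s
            proto << "$#{arg.bytesize}\r\n"     # bulk string header: length in bytes
            proto << "#{arg}\r\n"               # the argument itself
        end
        proto
    end

    # Emit 100,000 SET commands in protocol form on standard output.
    (0...100000).each do |n|
        STDOUT.write(gen_redis_proto("SET", "Key#{n}", "Value#{n}"))
    end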
    

    test_no_protocol.rb

    # Emit the same 100,000 SET commands as plain inline commands.
    (0...100000).each do |n|
        STDOUT.write("SET Key#{n} Value#{n}\r\n")
    end
    

    • ruby test.rb > 100k.txt
    • ruby test_no_protocol.rb > 100k_no_prot.txt
    • time cat 100k.txt | redis-cli --pipe
    • time cat 100k_no_prot.txt | redis-cli --pipe
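
    As an alternative to the intermediate files, the generated protocol can be streamed straight into redis-cli. This is only a sketch: `write_mass_insert` and `pipe_to_redis` are hypothetical names, and it assumes `redis-cli` is on the PATH and a server is reachable with default settings.

    ```ruby
    # Same encoder as in test.rb.
    def gen_redis_proto(*cmd)
      proto = "*#{cmd.length}\r\n"
      cmd.each do |arg|
        arg = arg.to_s
        proto << "$#{arg.bytesize}\r\n#{arg}\r\n"
      end
      proto
    end

    # Write `count` SET commands in protocol form to any IO-like object.
    def write_mass_insert(io, count)
      count.times do |n|
        io.write(gen_redis_proto("SET", "Key#{n}", "Value#{n}"))
      end
    end

    # Hypothetical wrapper: open `redis-cli --pipe` and write into its stdin.
    # Assumes redis-cli is installed and a server is running locally.
    def pipe_to_redis(count)
      IO.popen(["redis-cli", "--pipe"], "w") do |pipe|
        write_mass_insert(pipe, count)
      end
    end
    ```

    Calling `pipe_to_redis(100_000)` would then load the same keys as the file-based commands, with redis-cli printing its usual summary to the terminal.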

    I got these results (protocol file first, then inline commands):

    teixeira: ~/stackoverflow $ time cat 100k.txt | redis-cli --pipe
    All data transferred. Waiting for the last reply...
    Last reply received from server.
    errors: 0, replies: 100000
    
    real    0m0.168s
    user    0m0.025s
    sys     0m0.015s
    (5 file(s), 6.6Mb)
    
    teixeira: ~/stackoverflow $ time cat 100k_no_prot.txt | redis-cli --pipe
    All data transferred. Waiting for the last reply...
    Last reply received from server.
    errors: 0, replies: 100000
    
    real    0m0.433s
    user    0m0.026s
    sys     0m0.012s
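
    Putting the two "real" times side by side gives a rough ratio. The numbers below are copied from the runs above; this is a single measurement of each on one machine, so treat the figure as indicative only.

    ```ruby
    # Wall-clock ("real") times from the two runs above, in seconds.
    protocol_real = 0.168
    inline_real   = 0.433

    speedup = inline_real / protocol_real
    puts format("protocol input was about %.1fx faster on this run", speedup)
    # prints: protocol input was about 2.6x faster on this run
    ```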