
Deleting multiple bins from all the records of a set in Aerospike using Aerospike Python Client udf


How can I delete multiple bins from all the records of a set in Aerospike using a UDF through the Aerospike Python client? I tried passing one bin at a time to the UDF and running a scan to delete that bin from every record, but as expected this was very inefficient. I also tried building a list of bin names in Python and passing that list to the UDF. The code is shown below for reference.

Suppose I have 2000 records and 200 bins named '1', '2', '3', and so on. I want to delete the bins '1' through '99'. The namespace is testns and the set is udfBins. testUdf.lua is the Lua file containing the UDF, and my_udf is the name of the Lua function.

test.py

    scan = client.scan("testns", "udfBins")
    bins = [str(i) for i in range(1, 100)]  # bin names '1' .. '99'
    scan.apply("testUdf", "my_udf", [bins])
    job_id = scan.execute_background()
    while True:
        response = client.job_info(job_id, aerospike.JOB_SCAN)
        if response["status"] != aerospike.JOB_STATUS_INPROGRESS:
            break
    
    print("job done")

testUdf.lua

function my_udf(rec, bins)

    info(bins)
    for bin in python.iter(bins)
    do
        rec[bin] = nil
    end
    aerospike:update(rec)
end

The above code doesn't work, and I can't figure out why or what the correct way to solve the problem is. Any help is highly appreciated.

Thanks a lot in advance.


Solution

  • This is a bit of a tricky problem. We have to pass an array from Python to Lua as an argument to the Lua function. Here is the pertinent part of the code that I used to make it work.

    First, pass the array as a string, like so:

    bins = '{"1","2"}'
    # print(bins)
    self.client.scan_apply("test", "users", "testUdf", "my_udf", [bins])
    

    Note: the function is scan_apply (the name has an underscore), and the args are passed as a list. Here there is just one arg, the string bins, which the Lua code converts to a table and iterates over.

    Then in your testUdf.lua, do:

    function my_udf(rec, bins_list)
        bins_list = load("return "..bins_list)()
        for i,bin in ipairs(bins_list)
        do
            -- debug("bins_list_item: "..bin)
            rec[bin] = nil
        end
        aerospike:update(rec)
    end
    

    I used logging at debug level (you had info) to check what the Lua code was doing. This worked for me. I created 3 records with bins "1", "2" and "3", and then deleted bins "1" and "2" using the scan UDF as shown above.

    Here is sample output on one record after running the scan:

    {'3': 1, '1': 1, '2': 1}  <-- initial bins, 3 records, same bins, same values
    {"1","2"}  <--list that I passed as a string for setting these bins to nil
    {'3': 1}  <-- final bins
    

    I checked with AQL, all 3 records had their bins "1" and "2" deleted.

    aql> select * from test.users
    +---+
    | 3 |
    +---+
    | 1 |
    | 1 |
    | 1 |
    +---+
    3 rows in set (0.123 secs)
    

    This is a good link for further reading: https://discuss.aerospike.com/t/what-is-the-syntax-to-pass-2d-array-values-to-the-record-udf-using-aql/4378
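
For the original case of bins '1' through '99', typing the Lua table string by hand is tedious, so it can be built from a Python list instead. The following is only a minimal sketch: the helper name lua_table_literal is hypothetical (it is not part of the Aerospike client), and it assumes bin names are short strings with no newlines. Because the UDF evaluates the string with load, the names should come from a trusted source.

```python
def lua_table_literal(names):
    """Render an iterable of bin names as a Lua table constructor string.

    Hypothetical helper, not part of the Aerospike client API.
    """
    quoted = []
    for name in names:
        # Escape backslashes first, then double quotes, so each name
        # is a valid Lua double-quoted string.
        escaped = str(name).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    return "{" + ",".join(quoted) + "}"


# Same shape as the hand-written string used in the answer.
print(lua_table_literal(["1", "2"]))  # {"1","2"}

# Bin names '1' .. '99' from the question.
bins = lua_table_literal(str(i) for i in range(1, 100))
```

The resulting bins string would then be passed as the single argument, in the same way as the hand-written '{"1","2"}' in the scan_apply call above.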