Search code examples
lualua-table

Table Filter in Lua for multidimensional tables


I am designing a generic table filter which can remove entries from a given table, the problem is that keys are not unique to accomplish this and types are also different.

Let me elaborate more clearly with an example

SampleTable = {
    { key1 = 10, key2 = 'name_1', Str = 'sample_string_1'     },
    { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'     },
    { key1 = 11, key2 = 'name_2', Mac = {'ID1', 'ID2', 'ID3'} },
    { key1 = 11, key2 = 'name_2', Mac = {'ID1'}               },
    { key1 = 12, key2 = 'name_4', Mac = {'ID2', 'ID3'}        }
}

function filter(inputTable, ...) 
    filterItems = {...}
end

I want to pass any number of keys to filter this table

local key1 = 11
local Mac = 'ID1'
filter(SampleTable, key1, Mac)
 -- Should return -> { key1 = 11, key2 = 'name_2', Mac = 'ID1'},


key1 = 12
Mac = 'ID3'
filter(SampleTable, key1, Mac)
-- Should return -> { key1 = 12, key2 = 'name_4', Mac = ID3'}

key1 = 11
Mac = 'ID2'
filter(SampleTable, key1, Mac)
 -- Should return both    
-- { key1 = 11, key2 = 'name_2', Mac = ID2'},
-- { key1 = 11, key2 = 'name_5', Mac = ID2'},

key1 = 10
Str = 'sample_string_2'
filter(SampleTable, key1, Str)
 -- Should return { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'}

My current solution is search through each key,value pair in both tables

function filter(tIn, tFilter) 
    local retain = true
    local exist  = nil
    local tOut = tIn
    local _findInTable = function (t, k, v)
        if(not t[k]) then return true
        elseif(t[k] and t[k] == v) then return true
        else return false end
    end

    for i, t in ipairs (tIn) do
        for k,v in pairs (tFilter) do
            exist = _findInTable(t, k, v)
            retain = retain and exist
        end
        if not retain then tOut[i] = nil end
        retain = true
    end
    return tOut
end

local myTable = filter(SampleTable, {key1 = 11, Mac = 'ID1'})

The problem is I cannot foresee how recursion will help. This piece of code works when I have the following SampleTable, As you can see Mac is not a sub-table for my code.

SampleTable = {
    { key1 = 10, key2 = 'name_1', Str = 'sample_string_1'     },
    { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'     },
    { key1 = 11, key2 = 'name_2', Mac = 'ID1'                 }
    -- { key1 = 11, key2 = 'name_2', Mac = {'ID1', 'ID2', 'ID3'} },
    -- { key1 = 11, key2 = 'name_2', Mac = {'ID1'}               },
    -- { key1 = 12, key2 = 'name_4', Mac = {'ID2', 'ID3'}        }
}

Solution

  • From your question, it is not entirely clear whether you are dealing with a recursive schema (arbitrarily deep and branched structures), or whether the provided sample is all there is to it (keys are always assigned one to n values, no recursion in the schema). Without a more complex sample, I decided to implement this for the simpler case.


    Here's my solution to your problem, including your sample:

    local sample = {
        { key1 = 10, key2 = 'name_1', Str = 'sample_string_1'     },
        { key1 = 10, key2 = 'name_3', Str = 'sample_string_2'     },
        { key1 = 11, key2 = 'name_2', Mac = {'ID1', 'ID2', 'ID3'} },
        { key1 = 11, key2 = 'name_2', Mac = {'ID1'}               },
        { key1 = 12, key2 = 'name_4', Mac = {'ID2', 'ID3'}        }
    }
    
    --- Check if a row matches the specified key constraints.
    -- @param row The row to check
    -- @param key_constraints The key constraints to apply
    -- @return A boolean result
    local function filter_row(row, key_constraints)
        -- Loop through all constraints
        for k, v in pairs(key_constraints) do
            if v and not row[k] then
                -- The row is missing the key entirely,
                -- definitely not a match
                return false
            end
    
            -- Wrap the key and constraint values in arrays,
            -- if they're not arrays already (so we can loop through them)
            local actual_values = type(row[k]) == "table" and row[k] or {row[k]}
            local required_values = type(v) == "table" and v or {v}
    
            -- Loop through the values we *need* to find
            for i = 1, #required_values do
                local found
                -- Loop through the values actually present
                for j = 1, #actual_values do
                    if actual_values[j] == required_values[i] then
                        -- This object has the required value somewhere in the key,
                        -- no need to look any farther
                        found = true
                        break
                    end
                end
    
                if not found then
                    return false
                end
            end
        end
    
        return true
    end
    
    --- Filter an array, returning entries matching `key_values`.
    -- @param input The array to process
    -- @param key_values A table of keys mapped to their viable values
    -- @return An array of matches
    local function filter(input, key_values)
        local result = {}
    
        for i = 1, #input do
            local row = input[i]
            if filter_row(row, key_values) then
                result[#result + 1] = row
            end
        end
    
        return result
    end
    

    And here's the sample output, with a utility deep_print() function:

    --- Recursively print out a Lua value.
    -- @param value The value to print
    -- @param indent Indentation level (defaults to 0)
    -- @param no_newline If true, won't print a newline at the end
    local function deep_print(value, indent, no_newline)
        indent = indent or 0
    
        if type(value) == "table" then
            print("{")
            for k, v in pairs(value) do
                io.write(string.rep(" ", indent + 2) .. "[")
                deep_print(k, indent + 2, true)
                io.write("] = ")
                deep_print(v, indent + 2, true)
                print(";")
            end
            io.write(string.rep(" ", indent) .. "}")
        elseif type(value) == "string" then
            io.write(("%q"):format(value))
        else
            io.write(tostring(value))
        end
    
        if not no_newline then
            print()
        end
    end
    
    -- The following is a mix of Lua code
    -- and the script's output
    
    deep_print(filter(sample, {key1 = 10}))
    -- outputs
    {
      [1] = {
        ["key1"] = 10;
        ["key2"] = "name_1";
        ["Str"] = "sample_string_1";
      };
      [2] = {
        ["key1"] = 10;
        ["key2"] = "name_3";
        ["Str"] = "sample_string_2";
      };
    }
    
    deep_print(filter(sample, {key2 = "name_4"}))
    -- outputs
    {
      [1] = {
        ["key1"] = 12;
        ["key2"] = "name_4";
        ["Mac"] = {
          [1] = "ID2";
          [2] = "ID3";
        };
      };
    }
    
    deep_print(filter(sample, {Mac = {"ID2", "ID3"}}))
    -- outputs
    {
      [1] = {
        ["key1"] = 11;
        ["key2"] = "name_2";
        ["Mac"] = {
          [1] = "ID1";
          [2] = "ID2";
          [3] = "ID3";
        };
      };
      [2] = {
        ["key1"] = 12;
        ["key2"] = "name_4";
        ["Mac"] = {
          [1] = "ID2";
          [2] = "ID3";
        };
      };
    }
    
    deep_print(filter(sample, {Mac = {"ID2"}}))
    -- also outputs
    {
      [1] = {
        ["key1"] = 11;
        ["key2"] = "name_2";
        ["Mac"] = {
          [1] = "ID1";
          [2] = "ID2";
          [3] = "ID3";
        };
      };
      [2] = {
        ["key1"] = 12;
        ["key2"] = "name_4";
        ["Mac"] = {
          [1] = "ID2";
          [2] = "ID3";
        };
      };
    }
    
    -- Specifying multiple keys works too:
    deep_print(filter(sample, {key2 = "name_3", Str = "sample_string_2"}))
    -- outputs
    {
      [1] = {
        ["key1"] = 10;
        ["key2"] = "name_3";
        ["Str"] = "sample_string_2";
      };
    }
    

    As you can see, filter_row() is non-recursive. It only operates on the row-level keys. This should work if your schema is flat, as seen in the sample. If the data you want to filter is actually more complicated, please provide more examples.

    The operation of key comparison is made simpler by first wrapping the values in arrays (tables). This allows a uniform comparison approach for all possible cases (with a bit of extra overhead).

    This is my very first answer on SO, please edit/comment if there's anything wrong with it. Thanks.