Search code examples
lua

What is the fastest way to go through an array / table with numeric indices?


If i have an array with numbered indices in lua and need to go through every entry at least once, is it faster to use a numeric for loop or a generic for loop?


Solution

  • Semantical Difference

    for i = 1, #t do end
    

    is not the same as

    for i, v in ipairs(t) do end
    

    the latter does not rely on #t and respects the __ipairs metamethod (although this is deprecated and the __index and __newindex metamethods should be used instead).

    Assuming no metamethods are set, ipairs will simply loop until it encounters a nil value. It is thus roughly equivalent to the following loop:

    local i, v = 1, t[v]
    while v ~= nil do --[[loop body here]] i = i + 1; v = t[i] end
    

    This means there are two things it doesn't have to do:

    • It does not determine the length. It won't call a __len metamethod if set. This might theoretically result in a better performance for lists which reside in the hash part (where Lua has to determine the length through a search). It could also improve performance in cases where the __len metamethod does costly counting.
    • It does not have to loop over nil values. The numeric for loop on the other hand might loop over arbitrarily many nil values due to how the length operator is defined: For a table {[1] = 1, [1e9] = 1}, both 1 and 1e9 are valid values for the length. This also means it's unclear what it does, as the exact length value is unspecified.

    The latter point in particular means that in pathological cases, the numeric for loop could be arbitrarily slower. It also allows for mistakes, such as looping over (possibly long) strings instead of tables, and won't trigger an error:

    local str = "hello world"
    for i = 1, #str do local v = str[i] end
    

    will loop over only nil values (as it indexes the string metatable) but throw no error.

    I also consider ipairs to be more readable as it makes the intent clear.

    Performance Difference

    For non-pathological cases (lists residing in the list part, no "holes" (nil values), no odd metatables), the numeric for loop can be expected to run marginally faster, as it does not incur the call overhead of the for generic loop you'd be using with ipairs. This ought to be benchmarked on different Lua implementations though:

    • PUC Lua 5.1 to 5.4
    • LuaJIT 2.1.0

    In practice, the costs of looping will often be negligible compared to the cost of the operations performed within the loop body. Results may vary depending on other factors such as operating system or hardware.

    Rudimentary Benchmarks

    print(jit and jit.version or _VERSION)
    
    -- Fill list with 100 million consecutive integer values
    a = {}
    for i = 1, 1e8 do a[i] = i end
    
    local function bench(name, func)
        func() -- "warmup"
        local t = os.clock()
        for _ = 1, 10 do func() end
        print(name, os.clock() - t, "s")
    end
    
    bench("numeric", function()
        for i = 1, #a do
            local v = a[i]
        end
    end)
    
    bench("ipairs", function()
        for i, v in ipairs(a) do end
    end)
    

    Conducted on a Linux machine.

    Lua 5.1
    numeric 54.312082   s
    ipairs  63.579478   s
    Lua 5.2
    numeric 20.482682   s
    ipairs  32.757554   s
    Lua 5.3
    numeric 14.81573    s
    ipairs  23.121844   s
    Lua 5.4
    numeric 11.684143   s
    ipairs  24.455616   s
    

    Finally, LuaJIT:

    LuaJIT 2.1.0-beta3
    numeric 0.567874    s
    ipairs  0.70047 s
    

    Conclusion: Use LuaJIT if possible and stop worrying about micro-optimizations such as ipairs vs. numeric for (even though the latter may be slightly faster). Something as simple as an assert(i == v) will already cost as much as the loop itself (even if assert is local).