Search code examples
lua

Difference between stateful and stateless iterators in Lua


What is difference between stateless and stateful iterators in Lua? When do we need to use stateless and when the other one? I need examples to understand the concept.


Solution

  • First let's agree on a definition: (in Lua) an iterator is a function-like object that returns the next value in a sequence, every time it is called. I think it helps to rewrite the for iteration as done is Lua ref manual:

    for itemlist in expression do block end
    

    is logically equivalent to (pseudocode):

    do
        local func, seq, controlVar = expression
        while true do
            local itemlist = func(seq, controlVar)
            if first of itemlist == nil then break end
            controlVar = first of itemlist
    
            block (which uses items in itemlist)
       end
    end
    

    where expression is a triplet of data (or a function call that returns such a triplet):

    • func is the actual iterator function
    • seq is the sequence being iterated over
    • controlVar is the loop control variable

    Iteration state is any state that could be needed to find the next item in the sequence being iterated over. A stateless iterator is therefore one where func does not contain any such state: you can call func(seq, controlVar) at any time, the return value will always be the same (if seq has not changed); it does not depend on what happened before the call.

    As seen above, Lua supports one loop control variable. So in order for a sequence to be iterable via a stateless iterator, it must be possible to determine the next item in the sequence based on one loop control variable. I.e., it must be possible to figure out "next s item" from "(s, controlVar)" alone. The ipairs() generates an iterator that does this: ipairs(s) returns the triplet (iterFunction, s, 0); the iterFunction can be given s and the index 0, and returns 1, s[1], then 2, s[2], etc (eventually nothing for a table of N items).

    What if finding the next item in a sequence requires more than one loop control variable? Or depends on the state of other variables, which should be saved during iteration? Example:

    • an endless iterator may need to keep track of "first" item so that once the end of sequence is reached, it can resume at first item;
    • a graph iterator may need to keep track of "most recent sibling" in a depth first search so that once the end of a branch is reached, it can continue with next most recent sibling.

    A stateful iterator holds state about the iteration so that the next item can be found. In Lua this is possible if the iterator function is a closure (a function with upvalues) or a functor (a table that behaves as a function, i.e. has a __call metamethod). The up values (closure) or the data members (functor) could store the required state.

    A stateless iterator can always be wrapped into a stateful iterator. For ipairs:

    function statefulIpairs(s)
        local f, s, var = ipairs(s)
        return function() 
            local i, v = f(s,var)
            var = i
            return i, v
        end
    end
    

    This can then be called as

    tbl = {'a', 'b', 'c', 'd'}
    sip = statefulIpairs(tbl) -- sip is stateful iter specific to tbl
    for i,v in sip() do print(i,v) end
    

    A developer of a stateful iterator decides what capabilities the iterator has: the iterator's API may allow for rewind, inverting the direction, or other operations. This is even possible in the case of closures: additional parameters can be used to access the additional capabilities. Example, accept a third parameter which, when non nil, resets to beginning of sequence:

    function resetableStatefulIpairs(s)
        local f, s, var = ipairs(s)
        local start = var
        return function(a,b,reset)
            if reset ~= nil then var = start; return end        
            local i, v = f(s,var)
            var = i
            return i, v
        end
    end
    
    sip = resetableStatefulIpairs(tbl) -- sip is stateful iter specific to tbl
    for i,v in sip() do print(i,v) end
    sip(nil, nil, true) -- reset it
    for i,v in sip() do print(i,v) end
    

    Update A neater example is how to generate a function iterator that accepts commands such that you can "...stop anywhere in the sequence and iterate over the rest of the sequence 3 times" (as requested by @deduplicator):

    function iterGen(seq, start)
        local cvar = start or 1
        return function(cmd) 
            if cmd == nil then
                if cvar > #seq then return nil, nil end
                val = seq[cvar]
                cvar = cvar + 1
                return cvar-1, val
    
            else
                cmd = cmd[1]
                if cmd == 'rewind' then
                    cvar = start or 1
    
                elseif cmd == 'newstart' then
                    start = cvar
                end
            end
        end
    end
    

    With the above:

    > s = {1,2,3,4,5,6,7}
    > iter = iterGen(s)
    > for i,v in iter do print(i,v); if i==3 then break end  end
    1       1
    2       2
    3       3
    > iter {'newstart'} -- save current as the new start pos
    > for i,v in iter do print(i,v)  end -- continue till end
    4       4
    5       5
    6       6
    7       7
    > iter {'rewind'}
    > for i,v in iter do print(i,v)  end
    4       4
    5       5
    6       6
    7       7
    > iter {'rewind'}
    > for i,v in iter do print(i,v)  end
    4       4
    5       5
    6       6
    7       7
    

    As demonstrated, there is nothing special about stateful iterators except for the fact that the state of iteration is inside the iterator, so as stated, it is up to the developer to expose desired functionality like rewind and newstart above. With stateless iterators, there are no limits.

    It would be a more natural API to design the iterator as a functor, since then the iterator "function" has "methods" which can be called, but it was an interesting challenge to create a commandable function.