I found an unexpected behavior when using Lua for generating a string from a tree-like structure using closures and recursion.
I simplified the code to this, and made a JS version. The JS version works as expected, but I don't understand why or how exactly.
The code is:
function union(children)
local union = {}
function union:method(p)
print("union method", p, #children)
-- rename p2 to p to make it work as expected
function f(p2, children)
print("union f", p, #children)
local result = nil
if #children == 1 then
result = children[1]:method(p)
elseif #children == 2 then
result = children[1]:method(p) .. children[2]:method(p)
else
local first_child = table.remove(children)
result = first_child:method(p) .. f(p, children)
end
return result
end
return f(p, children)
end
return union
end
function transform(children)
local child = children[1]
local transform = {}
function transform:method(p)
print("transform start")
res = child:method(p .. "TRANSFORMED ")
print("transform end")
return res
end
return transform
end
function leaf()
local leaf = {}
function leaf:method(p)
return p
end
return leaf
end
root = union({leaf(), leaf(), transform({union({leaf()})}) })
print(root:method("p"))
The output is: "pTRANSFORMED pTRANSFORMED pTRANSFORMED"
But I expected: "pTRANSFORMED pp"
In summary, I don't understand why the "transformation" node is affecting the three leaves instead of just one.
The JS code is:
function union(children){
const union = {}
union.getField = (p) =>{
function f(p2, children){
console.log("union", p, p2, children.length)
let result = null
if (children.length == 1 ){
result = children[0].getField(p)
}else if ( children.length == 2 ){
result = children[0].getField(p) + children[1].getField(p)
}else {
const first_child = children.pop()
result = first_child.getField(p) + f(p, children)
}
return result
}
return f(p, children)
}
return union
}
function transform(children){
const child = children[0]
const transform = {}
transform.getField = (p)=>{
return child.getField(p + "TRANSFORMED ")
}
return transform
}
function leaf(){
const leaf = {}
leaf.getField = (p) =>{
return p
}
return leaf
}
const root = union([leaf(), leaf(), transform([union([leaf()])]) ])
console.log(root.getField("p"))
The function f
is not lexically scoped to each union:method
.
function foo()
end
is equivalent to
foo = function ()
end
Every a time a union:method
is called, the function f
is redefined in the lexical scope of the chunk.
This creates a closure around the upvalue p
when last redefined.
A couple of print statements bring this to light:
function union(children)
local union = {}
function union:method(p)
print("union method", p, #children)
function f(p2, children)
-- ...
print('f<p>', p)
-- ...
end
print('f redefined', f)
return f(p, children)
end
return union
end
union method p 3
f redefined function: 0x5652063c80c0
f<p> p
union method pTRANSFORMED 1
f redefined function: 0x5652063c8780
f<p> pTRANSFORMED
f<p> pTRANSFORMED
pTRANSFORMED pTRANSFORMED pTRANSFORMED
root:method("p")
causes f
to close around the value of p
as "p"
, but the first child method called is the transformed union,
local first_child = table.remove(children)
result = first_child:method(p) .. f(p, children)
which causes the redefinition of f
to close around its argument p .. "TRANSFORMED"
. The pseudo-recursive call is made right after, executing
result = children[1]:method(p) .. children[2]:method(p)
which the incorrect value of p
.
Changing p2
to p
brings the correct value back into scope via an argument, instead of being ignored for an incorrect upvalue. At this point the function does not close around anything, and could actually be placed completely outside the definition of union
.
Alternatively, lexically scoping each function to each union:method
causes it to create the correct closures,
local function f(_, __)
evidenced by needing no arguments.
union method p 3
f redefined function: 0x55b789c4f0a0
f<p> p
union method pTRANSFORMED 1
f redefined function: 0x55b789c4f7a0
f<p> pTRANSFORMED
f<p> p
pTRANSFORMED pp
In JavaScript, function declarations
function f(p2, children) {
}
are lexically scoped (and hoisted) within their enclosing block.