I am trying to understand the Deep Q-learning algorithm for the standard Cart-pole example using this tutorial. In the optimize_model() function, I don't understand whether the lambda expression returns a boolean or an index:
non_final_mask = torch.tensor(tuple(map(lambda s: s is not None, batch.next_state)), device=device, dtype=torch.bool)
where batch.next_state is just a list, and s is defined only in this line.
Judging from the documentation and this example, lambda s: s is not None produces a boolean. However, when I simply type this into Python:
>>> lambda s: s is None
I get
<function <lambda> at 0x100997010>
If I do get a boolean from the aforementioned lambda expression, how does map() handle it as its first argument?
Thanks for any help in advance.
EDIT: Thank you very much for all your comments and thorough answers! I need a little time to digest it all, but I disagree with the [duplicate] mark, as I haven't seen anywhere how a boolean function is applied to a list with map(). It is still not that clear to me.
map is a type, but to a first approximation you could assume it is a function defined as
    def map(f, xs):
        for x in xs:
            yield f(x)
To a second approximation, it can take multiple iterable arguments. In this form, it "zips" iterables together using its function argument, rather than just yielding tuples.
    def map(f, *args):
        for t in zip(*args):
            yield f(*t)
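As a rough illustration of the multi-iterable form, here is a small sketch using the real built-in map with operator.add. The input lists are made up for the example; the point is only that the function receives one item from each iterable, whereas zip alone would just pair the items up.

```python
import operator

# Built-in map with two iterables: the function is called with one item from each.
sums = list(map(operator.add, [1, 2, 3], [10, 20, 30]))
print(sums)   # [11, 22, 33]

# zip by itself only groups the items into tuples; no function is applied.
pairs = list(zip([1, 2, 3], [10, 20, 30]))
print(pairs)  # [(1, 10), (2, 20), (3, 30)]
```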
In your example, the function is being supplied as a lambda expression, rather than a function defined using a def statement.
non_final_mask = torch.tensor(tuple(map(lambda s: s is not None, batch.next_state)), device=device, dtype=torch.bool)
is equivalent to
    def not_none(s):
        return s is not None

    non_final_mask = torch.tensor(tuple(map(not_none, batch.next_state)), device=device, dtype=torch.bool)
In either case, map yields a series of Boolean values created by applying a function to each element yielded by the iterable batch.next_state; those Booleans are used to create a tuple that's passed to torch.tensor as its first argument.
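To make the distinction concrete, here is a minimal sketch without PyTorch. The list next_states is a hypothetical stand-in for batch.next_state, with None marking a final state. It shows that the lambda expression itself is a function object, that calling it produces a Boolean, and that map does the calling once per element.

```python
# Hypothetical stand-in for batch.next_state; None marks a final state.
next_states = ["s1", None, "s2", None]

# The lambda expression evaluates to a function object, not to a Boolean.
is_not_none = lambda s: s is not None
print(callable(is_not_none))   # True

# Calling the function is what produces a Boolean.
print(is_not_none("s1"))       # True
print(is_not_none(None))       # False

# map calls the function once for each element of the list.
mask = tuple(map(is_not_none, next_states))
print(mask)                    # (True, False, True, False)
```

The resulting tuple of Booleans is what the tutorial line hands to torch.tensor.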
Note that map was a function in Python 2, and it returned a list.
$ python2
[some macOS blather deleted]
Python 2.7.16 (default, Mar 25 2021, 03:11:28)
[GCC 4.2.1 Compatible Apple LLVM 11.0.3 (clang-1103.0.29.20) (-macos10.15-objc- on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> map(lambda s: s is not None, [1, None, 'c', None])
[True, False, True, False]
In Python 3, it was made a type so that it could return a lazy iterable instead. (The function is only applied as you iterate over the instance of map, not immediately upon calling map.) Had map not already existed, it's not likely Python 3 would have added it (or at least, it would only be available via the itertools module, rather than as a built-in type). map can virtually always be replaced with a generator expression. For example,
    non_final_mask = torch.tensor(
        tuple(s is not None for s in batch.next_state),
        device=device,
        dtype=torch.bool)
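The lazy behavior of Python 3's map can be observed with a small sketch. The calls list and the sample data are made up for the illustration; the function records each argument it receives, so you can see when it actually runs.

```python
calls = []  # records every argument the function is actually called with

def not_none(s):
    calls.append(s)
    return s is not None

lazy = map(not_none, [1, None, 'c'])
calls_after_creation = list(calls)
print(calls_after_creation)   # []  (nothing has been called yet)

first = next(lazy)            # pulls one value, so the function runs once
calls_after_first = list(calls)
print(first, calls_after_first)   # True [1]

rest = list(lazy)             # consumes the remaining values
print(rest, calls)            # [False, True] [1, None, 'c']
```

Wrapping the map object in tuple(), as the tutorial line does, consumes it in full right away, so the laziness makes no practical difference there.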