Let's say I have a directory containing files named 'filename_1', 'filename_2', etc., and a generator models_paths which I use to find the newest number:
from pathlib import Path

mypath = 'my/path/filename'
models_paths = Path(mypath).parent.glob(Path(mypath).name + '*')
number_newest = max(int(str(file_path).split('_')[-1]) for file_path in models_paths)
I would like to know whether max is building a list-like data structure or whether it is using an algorithm like
number_newest = None
for file_path in models_paths:
    number_current = int(str(file_path).split('_')[-1])
    number_newest = number_current if number_newest is None else max(number_current, number_newest)
In other words: Am I losing processing efficiency and/or memory efficiency if I write
mypath = 'my/path/filename'
models_paths = Path(mypath).parent.glob(Path(mypath).name + '*')
models_paths = list(models_paths)
number_newest = max(int(str(file_path).split('_')[-1]) for file_path in models_paths)
?
max does not build a list.
This can be demonstrated clearly in this example with a custom object:
class Thing:

    def __init__(self, x):
        self.x = x
        print(f'creating {x}')

    def __lt__(self, other):
        return self.x < other.x

    def __del__(self):
        print(f'destroying {self.x}')

    def __str__(self):
        return f'<{self.x}>'

print(max(Thing(i) for i in range(5)))
which gives:
creating 0
creating 1
destroying 0
creating 2
destroying 1
creating 3
destroying 2
creating 4
destroying 3
<4>
destroying 4
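As you can see, the __del__ method is called on each object as soon as it is determined that it is no longer the object with the maximum value. This would not be the case if the objects were being appended to a list. (The immediate destruction seen here is a consequence of CPython's reference counting; other Python implementations may call __del__ later.)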
Contrast with:
print(max([Thing(i) for i in range(5)]))
which gives:
creating 0
creating 1
creating 2
creating 3
creating 4
destroying 3
destroying 2
destroying 1
destroying 0
<4>
destroying 4
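To tie this back to the memory part of the question, here is a rough sketch that measures peak allocation with the standard-library tracemalloc module. The helper name peak_bytes and the size n are made up for illustration, and the exact byte counts will vary by Python version and platform, so only the relative difference is meaningful.

```python
import tracemalloc


def peak_bytes(make_iterable):
    # Hypothetical helper: report the peak memory traced while max()
    # consumes whatever iterable the factory returns.
    tracemalloc.start()
    max(make_iterable())
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak


n = 100_000  # arbitrary size, chosen only to make the difference visible

# Generator expression: values are produced one at a time.
peak_generator = peak_bytes(lambda: (i * 2 for i in range(n)))

# List comprehension: all values exist at once before max() starts.
peak_list = peak_bytes(lambda: [i * 2 for i in range(n)])

print(f'generator: {peak_generator} bytes, list: {peak_list} bytes')
```

With the generator, the peak should stay roughly constant as n grows, whereas with the list it grows in proportion to n. For a directory listing with a handful of files the difference is negligible in practice.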
You can write a (less efficient) equivalent function and show that it does the same thing:
def mymax(things):
    empty = True
    for thing in things:
        if empty or (thing > maximum):  # parentheses for clarity only
            maximum = thing
            empty = False
    if empty:
        raise ValueError
    return maximum
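As a quick check, the sketch below repeats mymax so that it runs on its own and compares it with the built-in max on a one-shot generator. The numbers generator and the consumed list are hypothetical names added only for this illustration.

```python
def mymax(things):
    # Same logic as the function above, repeated so the snippet is self-contained.
    empty = True
    for thing in things:
        if empty or thing > maximum:
            maximum = thing
            empty = False
    if empty:
        raise ValueError
    return maximum


consumed = []


def numbers():
    # Hypothetical generator that records each value as it is handed out.
    for n in [3, 7, 2, 7, 5]:
        consumed.append(n)
        yield n


# Both functions make a single pass over an iterator, so neither needs
# len() or indexing on its argument.
result = mymax(numbers())
print(result == max([3, 7, 2, 7, 5]))
print(consumed)

# Like max() without a default, mymax raises ValueError on empty input.
try:
    mymax([])
except ValueError:
    print('empty input raises ValueError')
```

Note that mymax never stores more than the current maximum and the item being examined, which is the same behaviour the Thing example shows for the built-in.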