I would like a Python object that can flexibly take any key and that I can access by key, like a dictionary, but that is immutable. One option could be to generate a namedtuple dynamically, but is it bad practice to do this? In the example below, a linter would not expect nt to have an attribute a, for example.
Example:
from collections import namedtuple

def foo(bar):
    MyNamedTuple = namedtuple("MyNamedTuple", [k for k in bar.keys()])
    d = {k: v for k, v in bar.items()}
    return MyNamedTuple(**d)

>>> nt = foo({"a": 1, "b": 2})
As I mentioned in the comments, I'm not sure why this is needed. But one could simply override __setitem__ of a dictionary class, although this will most likely cause problems down the line. A minimal example of this would be:
class autodict(dict):
    def __init__(self, *args, **kwargs):
        super(autodict, self).__init__(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        return val

    def __setitem__(self, key, val):
        pass

x = autodict({'a' : 1, 'b' : 2})
x['c'] = 3
print(x)
This prints {'a': 1, 'b': 2}, so the x['c'] = 3 assignment is silently ignored.
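One caveat worth spelling out: overriding only __setitem__ leaves the other mutating methods of dict untouched, so update(), del and friends still change the contents. The sketch below shows that effect and, for comparison, types.MappingProxyType from the standard library, which is a different technique (a read-only view over a regular dict) and not what the class above does. The name SetitemOnlyDict is made up for illustration.

```python
from types import MappingProxyType


class SetitemOnlyDict(dict):
    """Hypothetical name: a dict whose item assignment is silently ignored."""

    def __setitem__(self, key, val):
        pass  # x[key] = val does nothing


x = SetitemOnlyDict({'a': 1, 'b': 2})
x['c'] = 3          # ignored by the override
x.update({'c': 3})  # NOT ignored: dict.update() does not go through __setitem__
del x['a']          # NOT ignored: __delitem__ was never overridden

# Read-only view over a plain dict; item assignment raises TypeError.
frozen = MappingProxyType({'a': 1, 'b': 2})
try:
    frozen['c'] = 3
    assignment_blocked = False
except TypeError:
    assignment_blocked = True
```

Note that the proxy is only a view: anyone who still holds a reference to the underlying dict can modify it, so this assumes the original dict is not kept around.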
Creating objects this way is somewhere between 40 and 1000 times faster with dictionary inheritance than with named tuples. (See below for crude speed tests.)
The in operator tests keys on dictionaries, but on named tuples it tests values, so it does not work as a key check when used like this:
('a' in nt) == False
('a' in x) == True
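To make the membership difference concrete, here is a small sketch. The class name Pair is only for illustration; _fields is the documented namedtuple attribute that holds the field names, and it is one way to check for a "key" on a named tuple.

```python
from collections import namedtuple

Pair = namedtuple("Pair", ["a", "b"])  # hypothetical name for this sketch
nt = Pair(a=1, b=2)
d = {"a": 1, "b": 2}

key_in_dict = "a" in d             # dicts test membership against keys
key_in_tuple = "a" in nt           # tuples test membership against values
value_in_tuple = 1 in nt           # 1 is one of the stored values
key_in_fields = "a" in nt._fields  # field names live in _fields
has_attr = hasattr(nt, "a")        # attribute lookup also works as a check
```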
You can use dictionary-style key access instead of (for lack of a better term) JavaScript-style attribute access:
x['a'] == nt.a
Although that's a matter of taste.
You also don't have to be picky about keys, since dictionaries support essentially any hashable key:
x = autodict({1 : 'a number'})
nt = foo({1 : 'a number'})
The named tuple version raises ValueError: Type names and field names must be valid identifiers: '1'
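For completeness, namedtuple does have a rename=True option that replaces invalid field names with positional ones instead of raising. A minimal sketch of both behaviours follows; the class name Renamed and the sample field names are made up for illustration.

```python
from collections import namedtuple

# Without rename, an invalid identifier such as '1' raises ValueError.
try:
    namedtuple("MyNamedTuple", ["1"])
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# With rename=True, invalid names (digits, keywords) become _<index>.
Renamed = namedtuple("Renamed", ["1", "class", "ok"], rename=True)
fields = Renamed._fields
```

This avoids the exception, but the original keys are lost in the process, so it is not a drop-in replacement for a dictionary with arbitrary keys.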
Now, this is a crude benchmark, and the numbers will vary a lot depending on the system, the position of the moon in the sky, etc. But as a rough illustration:
import time
from collections import namedtuple

class autodict(dict):
    def __init__(self, *args, **kwargs):
        super(autodict, self).__init__(*args, **kwargs)
        #self.update(*args, **kwargs)

    def __getitem__(self, key):
        val = dict.__getitem__(self, key)
        return val

    def __setitem__(self, key, val):
        pass

    # Note: __type__ is not a special method that Python looks up,
    # so this definition has no effect on type(x).
    def __type__(self, *args, **kwargs):
        return dict

def foo(bar):
    MyNamedTuple = namedtuple("MyNamedTuple", [k for k in bar.keys()])
    d = {k: v for k, v in bar.items()}
    return MyNamedTuple(**d)

# Time one million single-key named tuples (a new class is built on every call).
start = time.time()
for i in range(1000000):
    nt = foo({'x'+str(i) : i})
end = time.time()
print('Named tuples:', end - start,'seconds.')

# Time one million single-key autodict instances.
start = time.time()
for i in range(1000000):
    x = autodict({'x'+str(i) : i})
end = time.time()
print('Autodict:', end - start,'seconds.')
Results in:
Named tuples: 59.21987843513489 seconds.
Autodict: 1.4844810962677002 seconds.
The dictionary setup is, in my book, insanely quicker. That most likely has to do with the multiple for loops in the named tuple setup, plus the fact that every call to foo() builds a brand-new named tuple class, and it can probably be remedied somehow. But for a basic understanding, this is a big difference. The example obviously doesn't test larger one-time creations or access times. It only asks: "what if you use these options to create data sets over a period of time, how much time would you lose?" :)
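One possible remedy for the per-call cost, sketched under the assumption that the same set of keys shows up more than once: cache the generated class per tuple of field names with functools.lru_cache. The names tuple_class_for and foo_cached are hypothetical and not part of the code above.

```python
from collections import namedtuple
from functools import lru_cache


@lru_cache(maxsize=None)
def tuple_class_for(field_names):
    """Build a namedtuple class once per distinct tuple of field names."""
    return namedtuple("MyNamedTuple", field_names)


def foo_cached(bar):
    # Hypothetical variant of foo(): reuse the class when the keys repeat.
    cls = tuple_class_for(tuple(bar))
    return cls(**bar)


first = foo_cached({"a": 1, "b": 2})
second = foo_cached({"a": 3, "b": 4})
same_class = type(first) is type(second)
```

This would not help the benchmark above, where every iteration uses a different key, but it should help when many records share the same keys. I have not timed it.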
Bonus: What if you have a large base dictionary, and want to freeze it?
base_dict = {'x'+str(i) : i for i in range(1000000)}
start = time.time()
nt = foo(base_dict)
end = time.time()
print('Named tuples:', end - start,'seconds.')
start = time.time()
x = autodict(base_dict)
end = time.time()
print('Autodict:', end - start,'seconds.')
Well, the difference was bigger than I expected: the autodict was about 1038.5 times faster.
(I was using the CPU for other stuff, but I think this is fair game.)
Named tuples: 154.0662612915039 seconds.
Autodict: 0.1483476161956787 seconds.