Given a fixed size n and a dict x of input key-value pairs, the goal is to iterate through 1...n (1-based index), fetching the value from x if the index exists as a key in x and otherwise inserting the value -1.
I've tried the following and it kind of works as expected:
n = 10
# Valid keys range over [1, 10]; any positive integer is a valid value.
x = {1:231, 2:341, 5:123}
y = {i+1:x[i+1] if i+1 in x else -1 for i in range(n)}
y
[out]:
{1: 231, 2: 341, 3: -1, 4: -1, 5: 123, 6: -1, 7: -1, 8: -1, 9: -1, 10: -1}
But this seems like a very common pandas or encoding / embedding operation.
Is there a different/simpler way to take the sparse key-value pairs from x and directly create y, given that we know n, in O(len(x)) rather than by iterating through all O(n) indices?
The rationale is that if I have billions of x dicts and n is large (e.g. in the thousands), the full O(n) operation becomes really expensive.
The pandas equivalent would be to reindex:
y = pd.Series(x).reindex(range(1, n+1), fill_value=-1)
However, I believe that trying to assign all values up front is probably not the right approach. No matter whether it runs at Python or C speed, this will be algorithmically expensive if n is large.
Rather, use a defaultdict or setdefault to take advantage of "on-demand" creation of key/values in your dictionary:
from collections import defaultdict
y = defaultdict(lambda : -1, x)
y[1]
# 231
y[4]
# -1
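One caveat with the defaultdict approach is that looking up a missing key with y[k] also stores that key, so the dictionary grows with the number of distinct keys queried. If read-only lookups are enough, a minimal sketch using dict.get returns the same default without inserting anything (the variable names before, after, value and size are only for illustration):

```python
from collections import defaultdict

x = {1: 231, 2: 341, 5: 123}

# defaultdict: a missing-key lookup inserts the default as a side effect.
y = defaultdict(lambda: -1, x)
before = len(y)       # number of keys copied from x
missing = y[4]        # returns the default and stores key 4
after = len(y)        # one more key than before

# Plain dict.get: same default value, but x is left untouched.
value = x.get(4, -1)
size = len(x)
```

Which one fits depends on whether the filled-in keys need to be materialized later or only read.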
Or maybe:
n = 100
i = 50
# Valid keys are 1..n inclusive, as stated in the question.
if 1 <= i <= n:
    x.setdefault(i, -1)
else:
    raise ValueError(f'key {i} should be in the range 1..{n}')
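The range check and the on-demand default could also be combined in one place. The following is only a sketch: BoundedSparseDict is a hypothetical helper name, not something from the standard library, and it assumes integer keys in 1..n. It relies on the documented __missing__ hook, which dict subclasses can define to handle lookups of absent keys.

```python
class BoundedSparseDict(dict):
    """Hypothetical helper: dict with a default for missing keys in 1..n."""

    def __init__(self, data, n, default=-1):
        super().__init__(data)
        self.n = n
        self.default = default

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        if not 1 <= key <= self.n:
            raise KeyError(f"key {key} is outside 1..{self.n}")
        # The default is returned without being stored.
        return self.default


x = {1: 231, 2: 341, 5: 123}
y = BoundedSparseDict(x, n=10)
```

Here y[1] gives the stored value, y[4] gives -1 without adding key 4, and y[11] raises a KeyError, so construction stays O(len(x)) and nothing is filled in up front.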