Is there a way of assigning a special key to a dictionary that actually does nothing?
I want to do something like:
mydict = {}
key, value = 'foo', 'bar'
mydict[key] = value  # now mydict is {'foo': 'bar'}
Now here I want some "special" value of key such that when I run:
mydict[key] = value
It doesn't actually do anything, so mydict is still {'foo': 'bar'} (no extra keys or values added)
I tried using:
mydict[None] = None  # this actually adds {None: None} to the dict
mydict[] = []  # invalid syntax
Why I need this:
Well it's basically to handle an initial case.
I have a file in FASTA format:
>id_3362
TGTCAGTGTTCCCCGTGGCCCTGCGGTTGGAATTGCAGCGGGTCGCTTTAGTTCTGGCAT
ATATTTTGACGGTGCCGGCCGGCGATACTGACGTGTGAGGACTTGAATTTGTACCAGCGC
AACACTTCCAAAGCCTGGACTAGGTTGT
>id_4743
CGGGGGATCTAATGTGGCTGCCACGGGTTGAAAAATGG
>id_5443
ATATTTTGACGGTGCCGGCCGGCGATACTGACGTGTGAGGACTTGAATTTGTACCAGCGC
AACACTTCCAAAGCCTGGACTAGGTTGT
My approach is to read line by line, concatenating the lines into a sequence until the next key is found (line starting with >). Then I save the key (id) with the associated value (sequence) in a dictionary, update the key and start accumulating the next sequence.
Of course I could have dedicated (repeated) code that handles the first case, which I don't think is a clean approach, or I could have an if inside the loop that reads each line (which would execute every time).
So the cleanest approach would be, every time an id is found, to save the previous id with the accumulated seq to the dictionary. But to handle the first line I need some special value for the key.
Here's my code:
def read_fasta(filename):
    mydict = {}
    id = None  # this has to be the special character I'm looking for
    seq = ''
    with open(filename) as f:
        for line in f:
            if line[0] == '>':
                mydict[id] = seq  # save current id and seq
                id = line[1:].rstrip('\n')  # update id
                seq = ''  # clean seq
            else:
                seq += line.rstrip('\n')  # accumulate seq
As you can see, in this code the first header line inserts {None: ''} into the dictionary.
I could of course delete this key at the very end, but I'm wondering if I can have an initial value that doesn't insert anything when executed.
Any suggestions?
You could of course do:
id = None
then:
if id is not None: mydict[id] = seq
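Put together, a minimal sketch of this guarded version might look like the following. It takes any iterable of lines rather than a filename (an assumption, to keep the example self-contained), and it also stores the last record after the loop, which the code in the question does not do. The name `read_fasta_lines` is made up for illustration.

```python
def read_fasta_lines(lines):
    """Parse FASTA-formatted lines into a {id: sequence} dict (illustrative helper)."""
    mydict = {}
    seq_id = None   # None means "no header seen yet"
    seq = ''
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('>'):
            if seq_id is not None:   # skip the save before the first header
                mydict[seq_id] = seq
            seq_id = line[1:]        # start a new record
            seq = ''
        else:
            seq += line              # accumulate the sequence
    if seq_id is not None:           # store the final record
        mydict[seq_id] = seq
    return mydict


records = read_fasta_lines(['>id_1\n', 'ACGT\n', 'TTGA\n', '>id_2\n', 'CCC\n'])
print(records)
```

With a real file you would pass the open file object, since iterating over it yields lines.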
If you want to avoid the insertion without an if test, you could also use a non-hashable value at the start:
id = []
then catch the TypeError that an unhashable key raises. It is ugly, but it works, and it adds almost no overhead because the exception is raised only once:
try:
    mydict[id] = seq
except TypeError:
    pass
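As a rough sketch of how the unhashable sentinel fits into the whole loop, assuming the input is an iterable of lines and using the hypothetical name `read_fasta_unhashable`:

```python
def read_fasta_unhashable(lines):
    """Same parsing loop, but the initial key is an unhashable list (illustrative)."""
    mydict = {}
    seq_id = []     # unhashable sentinel: using it as a dict key raises TypeError
    seq = ''
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('>'):
            try:
                mydict[seq_id] = seq   # fails only while seq_id is still the list
            except TypeError:
                pass
            seq_id = line[1:]
            seq = ''
        else:
            seq += line
    try:
        mydict[seq_id] = seq           # final record; skipped if no header was ever seen
    except TypeError:
        pass
    return mydict


result = read_fasta_unhashable(['>id_1\n', 'ACGT\n', '>id_2\n', 'GG\n', 'TT\n'])
print(result)
```

Note that a bare `except TypeError` would also hide an unrelated TypeError raised inside the try block, which is one reason the explicit `is not None` test is usually easier to reason about.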
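Aside: if speed is your concern, avoid building the sequence by repeated string concatenation.
seq += line.rstrip('\n')
can perform badly on long sequences, because each concatenation may copy the string built so far. Instead, define seq as a list:
seq = []
append each line to it:
seq.append(line.rstrip('\n'))
and join the pieces once at the end:
seq = "".join(seq)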
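Here is one way the list-and-join idea could be combined with the `is not None` guard. This is only a sketch: `read_fasta_join` and `parts` are illustrative names, and the input is assumed to be an iterable of lines.

```python
def read_fasta_join(lines):
    """Collect sequence lines in a list and join them once per record (illustrative)."""
    mydict = {}
    seq_id = None
    parts = []                       # pieces of the current sequence
    for line in lines:
        line = line.rstrip('\n')
        if line.startswith('>'):
            if seq_id is not None:
                mydict[seq_id] = ''.join(parts)   # one join per record
            seq_id = line[1:]
            parts = []
        else:
            parts.append(line)
    if seq_id is not None:           # store the final record
        mydict[seq_id] = ''.join(parts)
    return mydict


joined = read_fasta_join(['>id_1\n', 'ACGT\n', 'TTGA\n', '>id_2\n', 'CCC\n'])
print(joined)
```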