I have a dictionary variable "d" whose keys are integers and whose values are lists of strings:
368501900 ['GH131.hmm ', 'CBM1.hmm ']
368499531 ['AA8.hmm ']
368500556 ['AA7.hmm ']
368500559 ['GT2.hmm ']
368507728 ['GH16.hmm ']
368496466 ['AA2.hmm ']
368504803 ['GT21.hmm ']
368503093 ['GT1.hmm ', 'GT4.hmm ']
My code looks like this:
d = dict()  # populated with the data shown above
for key in d:
    dictValue = d[key]
    # drop duplicates while keeping the original order
    dictMerged = list(sorted(set(dictValue), key=dictValue.index))
    print(key, dictMerged)
However, I want to strip the digits and everything after them from each string in the lists (dropping any duplicates that result), so that I get a result like this:
368501900 ['GH', 'CBM']
368499531 ['AA']
368500556 ['AA']
368500559 ['GT']
368507728 ['GH']
368496466 ['AA']
368504803 ['GT']
368503093 ['GT']
I think the code should go between the dictValue and dictMerged lines, but I can't work out the logic. Any ideas?
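Import the re module at the beginning:
import re
Then add this line between dictValue and dictMerged:
new_dict_value = [re.sub(r'\d.*', '', x) for x in dictValue]
The pattern \d.* matches the first digit and everything after it, so each string is cut down to its leading letters. Finally, use new_dict_value instead of dictValue in the dictMerged line, both inside set(...) and in key=....index, so that the duplicates created by stripping (such as 'GT' appearing twice) are removed.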
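Putting the pieces together, here is a minimal sketch of the whole loop. It assumes the dictionary holds a few of the sample entries from the question; the `result` dictionary is not part of the original code and is added here only so the output can be checked.

```python
import re

# Sample data copied from the question (a subset, for illustration).
d = {
    368501900: ['GH131.hmm ', 'CBM1.hmm '],
    368499531: ['AA8.hmm '],
    368503093: ['GT1.hmm ', 'GT4.hmm '],
}

result = {}  # hypothetical helper, only used to inspect the output
for key in d:
    dictValue = d[key]
    # Remove the first digit and everything after it from each string.
    new_dict_value = [re.sub(r'\d.*', '', x) for x in dictValue]
    # Drop duplicates while keeping the order of first appearance.
    dictMerged = sorted(set(new_dict_value), key=new_dict_value.index)
    result[key] = dictMerged
    print(key, dictMerged)
```

Because the de-duplication runs after the substitution, 'GT1.hmm ' and 'GT4.hmm ' collapse into a single 'GT' entry, which matches the desired output in the question.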