I have a list of dictionaries like this:
myList = [
    {
        'id': 1,
        'text': ['I like cheese.',
                 'I love cheese.', 'oh Ilikecheese !'],
        'text_2': [('david',
                    'david',
                    'I do not like cheese.'),
                   ('david',
                    'david',
                    'cheese is good.')]
    },
    {
        'id': 2,
        'text': ['I like strawberry.', 'I love strawberry'],
        'text_2': [('alice',
                    'alice',
                    'strawberry is good.'),
                   ('alice',
                    'alice',
                    ' strawberry is so so.')]
    }
]
I want to delete the words that are longer than a certain number of letters (e.g. 9 letters).
The ideal output is the same list of dictionaries with the misspelled words deleted, e.g. with "Ilikecheese" removed:
myList = [
    {
        'id': 1,
        'text': ['I like cheese.',
                 'I love cheese.', 'oh!'],
        'text_2': [('david',
                    'david',
                    'I do not like cheese.'),
                   ('david',
                    'david',
                    'cheese is good.')]
    },
    {
        'id': 2,
        'text': ['I like strawberry.', 'I love strawberry'],
        'text_2': [('alice',
                    'alice',
                    'strawberry is good.'),
                   ('alice',
                    'alice',
                    ' strawberry is so so.')]
    }
]
Any suggestions?
Remove every word in a string whose length is greater than or equal to 9. Criterion for splitting a string: a single space.
# myList is the list of dictionaries shown above
for d in myList:
    for k, v in d.items():
        if isinstance(v, list):
            # rebuild each string, keeping only words shorter than 9 characters
            for i, word in enumerate(v):
                v[i] = ' '.join(filter(lambda w: len(w) < 9, word.split(' ')))

for d in myList:
    print(d)
Output (from a different sample list than the one in the question):
{'id': 1, 'text': ["I 'll tell you what . Next say ' Potts ' on the tower .", 'I assume . Light her up .', 'Cap , I need the lever !']}
{'id': 2, 'text': ['Dr. Banner .', 'Stark , we need a plan of attack !', '( taken by that )', 'Everyone ! Clear out !', "Think the guy 's a friendly ?", 'Those people need .', 'Then suit up .']}
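To see what the filter does to a single string, here is a minimal sketch. The helper name `drop_long_words` and the `max_len` parameter are made up for illustration. Note that punctuation attached to a word counts toward its length, and that 'strawberry' has 10 characters, so a limit of 9 removes it as well.

```python
def drop_long_words(text, max_len=9):
    """Keep only the space-separated tokens shorter than max_len characters."""
    return ' '.join(w for w in text.split(' ') if len(w) < max_len)


print(drop_long_words('oh Ilikecheese !'))    # oh !
print(drop_long_words('I love strawberry'))   # I love
print(drop_long_words('I like cheese.'))      # I like cheese.
```

If words like 'strawberry' should survive, raising `max_len` (for example to 11) is probably the simplest adjustment.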
If the values are tuples instead of lists:
for d in myList:
    for k, v in d.items():
        if isinstance(v, tuple):
            v = list(v)
            for i, word in enumerate(v):
                v[i] = ' '.join([w for w in word.split(' ') if len(w) < 9])
            d[k] = tuple(v)

for d in myList:
    print(d)
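The sample data in the question nests tuples inside a list ('text_2'), and neither loop above processes the strings inside those tuples. One possible way to cover that case is a small recursive helper. This is only a sketch, assuming every leaf is either a string or a value that should be left untouched. The names `clean`, `max_len`, and `my_list` are hypothetical.

```python
def clean(value, max_len=9):
    """Recursively drop long words from strings nested in lists/tuples."""
    if isinstance(value, str):
        return ' '.join(w for w in value.split(' ') if len(w) < max_len)
    if isinstance(value, (list, tuple)):
        # Rebuild the container with the same type (list or tuple).
        return type(value)(clean(item, max_len) for item in value)
    return value  # e.g. the integer 'id' is returned unchanged


# Shortened version of the question's data, for illustration only.
my_list = [
    {'id': 1,
     'text': ['I like cheese.', 'oh Ilikecheese !'],
     'text_2': [('david', 'david', 'cheese is good.')]},
]

# Build new dictionaries instead of mutating the originals in place.
cleaned = [{k: clean(v) for k, v in d.items()} for d in my_list]
print(cleaned)
```

Unlike the loops above, this returns new dictionaries rather than modifying `my_list` in place, which avoids reassigning values while iterating.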