I have a nested list of tagged words, built from these sentences:
"Add changes & things to hot 50 playlist"
"add Madchild to Electro Latino"
"Add artist to my 80'S PARTY"
slot_list = [[['changes', 'entity_name'], ['&', 'entity_name'], ['things', 'entity_name'], ['hot', 'playlist'], ['50', 'playlist']],
[['Madchild', 'artist'], ['Electro', 'playlist'], ['Latino', 'playlist']],
[['artist', 'music_item'], ['my', 'playlist_owner'], ["80'S", 'playlist'], ['PARTY', 'playlist']]]
I want to join the strings at position [0] whenever their position [1] elements (the slots) are the same, while keeping the nested structure, since each inner list belongs to one sentence.
The expected output:
output = [[['entity_name', 'changes & things'], ['playlist', 'hot 50']],
[['artist', 'Madchild'], ['playlist', 'Electro Latino']],
[['music_item', 'artist'], ['playlist_owner', 'my'], ['playlist', "80'S PARTY"]]]
This is the code I used:
from collections import defaultdict

dic = defaultdict(str)
for element in slot_list:
    for word, slot in element:
        dic[slot] += ' ' + str(word)
print([[slot, words] for slot, words in dic.items()])
And I got:
[['entity_name', ' changes & things'], ['playlist', " hot 50 Electro Latino 80'S PARTY"], ['artist', ' Madchild'], ['music_item', ' artist'], ['playlist_owner', ' my']]
This combines all the words that share a slot across every sentence, because the dict is keyed only by slot. I also tried groupby, but could not get that to work either.
Hope someone can give me some guidance! Thanks!
Some terminology:
A pair is a list containing two strings: the first one (the value) is the value represented by the second one (the key). So the pair ['changes', 'entity_name'] represents an entity name with the value "changes", and the pair ['hot', 'playlist'] represents a playlist with the value "hot".
A slot is a list of pairs.
Assuming pairs that share a key are adjacent within each slot (itertools.groupby only groups consecutive items), and given a slot such as
[
['changes', 'entity_name'],
['&', 'entity_name'],
['things', 'entity_name'],
['hot', 'playlist'],
['50', 'playlist'],
]
you can group the pairs in the slot by each pair's second element:
# itertools.groupby(slot, key=lambda x: x[1])
[
    ['entity_name', [
        ['changes', 'entity_name'],
        ['&', 'entity_name'],
        ['things', 'entity_name'],
    ]],
    ['playlist', [
        ['hot', 'playlist'],
        ['50', 'playlist'],
    ]],
]
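The structure above is only an illustration: groupby actually yields (key, iterator) tuples lazily. As a minimal sketch, assuming the same example slot, the groups can be materialized with list() to inspect them (the name `grouped` is just for this illustration):

```python
import itertools

slot = [
    ['changes', 'entity_name'],
    ['&', 'entity_name'],
    ['things', 'entity_name'],
    ['hot', 'playlist'],
    ['50', 'playlist'],
]

# groupby yields (key, iterator) tuples; list() materializes each group
# so it can be printed or inspected more than once.
grouped = [[key, list(pairs)]
           for key, pairs in itertools.groupby(slot, key=lambda pair: pair[1])]

for key, pairs in grouped:
    print(key, pairs)
```

Each group iterator is tied to the underlying groupby object, so it should be consumed (or copied into a list, as here) before advancing to the next group.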
For each group of pairs, join all the first elements with a space:
import itertools

def group_slots(slots):
    # For each slot in the list of slots, group it
    return [group_slot(slot) for slot in slots]

def group_slot(slot):
    # Join the values of consecutive pairs that share the same key
    return [[key, ' '.join(pair[0] for pair in pairs)]
            for key, pairs in itertools.groupby(slot, key=lambda x: x[1])]
Then
slots = [
[
['changes', 'entity_name'],
['&', 'entity_name'],
['things', 'entity_name'],
['hot', 'playlist'],
['50', 'playlist'],
],
[
['Madchild', 'artist'],
['Electro', 'playlist'],
['Latino', 'playlist'],
],
[
['artist', 'music_item'],
['my', 'playlist_owner'],
["80'S", 'playlist'],
['PARTY', 'playlist'],
],
]
result = group_slots(slots)
print(result)
outputs (shown here formatted for readability; print writes it on a single line)
[
[
['entity_name', 'changes & things'],
['playlist', 'hot 50'],
],
[
['artist', 'Madchild'],
['playlist', 'Electro Latino'],
],
[
['music_item', 'artist'],
['playlist_owner', 'my'],
['playlist', "80'S PARTY"],
],
]
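If pairs with the same key are not guaranteed to be adjacent, groupby would produce several groups for that key. One possible alternative, closer to the dict approach in the question, is to build a fresh dict for each sentence instead of one shared dict. This is only a sketch: `merge_slot` and `merge_slots` are hypothetical names, and it assumes that non-adjacent words with the same key should be merged and that Python 3.7+ dict insertion order is available.

```python
def merge_slot(slot):
    """Merge values by key within one slot, keeping first-seen key order."""
    merged = {}  # a new dict per slot, so sentences never mix
    for value, key in slot:
        merged.setdefault(key, []).append(str(value))
    return [[key, ' '.join(values)] for key, values in merged.items()]


def merge_slots(slots):
    # Apply the per-slot merge to every sentence independently
    return [merge_slot(slot) for slot in slots]


example = [
    [['Madchild', 'artist'], ['Electro', 'playlist'], ['Latino', 'playlist']],
    # Hypothetical input where the 'playlist' pairs are not adjacent
    [['hot', 'playlist'], ['my', 'playlist_owner'], ['50', 'playlist']],
]
merged_example = merge_slots(example)
print(merged_example)
```

For inputs where equal keys are always adjacent, this should give the same result as the groupby version; the difference only shows up when a key reappears later in the same sentence.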