Tags: python, pandas, list, dataframe, series

How to separate values from pandas series into dictionary?


I have a pandas series like this:

LIST  0     ITEM1
1           Element1
2           Element2
3           Element3           
4           Element4
5           Element5
6           Element6
7           Element7
8           ITEM2
9           Element8
10          ELEMENT9
11          ELEMENT10
12          Element11
13          Element12      
14          Element13
15          Element14
16          Element2
17          Element24
18          Element25
19          Element26
20          ITEM3
21          Element28
Name: Items, dtype: object

I would like to separate the items from the elements. In the real data, the elements aren't all called 'Element...' and the items aren't all called 'ITEM...', so I cannot rely on the naming (for example, checking whether a value contains 'Element' or 'ITEM'). I need to access the values by dictionary key or by DataFrame column. For example:

df['ITEM1'] should give the first seven elements, Element1 to Element7,
or dict['ITEM1'] should map to those same seven elements.

How can I separate the elements from the items?


Solution

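  • You can use a dict comprehension. The code below assumes the series is the "Items" column of a DataFrame df. str.startswith("ITEM") flags each header row, cumsum() turns those flags into a running group number, and each group's first value becomes the key for the rest of that group. It relies on the "ITEM" prefix; if the real headers share no naming pattern, swap in any other boolean mask that marks them:

    # The group number increases by one at every row that starts with "ITEM".
    groups = df.groupby(df["Items"].str.startswith("ITEM").cumsum())["Items"]
    print({g.iloc[0]: g.iloc[1:].tolist() for _, g in groups})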
    
    {'ITEM1': ['Element1', 'Element2', 'Element3', 'Element4', 'Element5', 'Element6', 'Element7'],
     'ITEM2': ['Element8', 'ELEMENT9', 'ELEMENT10', 'Element11', 'Element12', 'Element13', 'Element14',
               'Element2', 'Element24', 'Element25', 'Element26'],
     'ITEM3': ['Element28']}
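
For completeness, here is a self-contained sketch of the same cumulative-sum grouping that also covers the DataFrame-column access asked for in the question. The helper name `split_by_headers`, its `is_header` parameter, and the variables `items`, `mask`, `groups`, and `wide` are illustrative and not part of the original answer. The sample data copies the series from the question, and the "ITEM" prefix is used only because it fits that sample.

```python
import pandas as pd

# Sample data copied from the series in the question.
items = pd.Series(
    ["ITEM1", "Element1", "Element2", "Element3", "Element4", "Element5",
     "Element6", "Element7", "ITEM2", "Element8", "ELEMENT9", "ELEMENT10",
     "Element11", "Element12", "Element13", "Element14", "Element2",
     "Element24", "Element25", "Element26", "ITEM3", "Element28"],
    name="Items",
)


def split_by_headers(series, is_header):
    """Map each header value to the list of values that follow it.

    `is_header` is a boolean Series aligned with `series`, True on every
    row that starts a new group. Rows before the first header are skipped.
    If two headers share the same value, the later group overwrites the
    earlier one.
    """
    group_ids = is_header.cumsum()  # increases by one at each header row
    result = {}
    for group_id, chunk in series.groupby(group_ids):
        if group_id == 0:
            continue  # rows that appear before any header
        result[chunk.iloc[0]] = chunk.iloc[1:].tolist()
    return result


# Assumption: headers are recognizable by the "ITEM" prefix, as in the sample.
mask = items.str.startswith("ITEM")
groups = split_by_headers(items, mask)

# Dictionary access.
print(groups["ITEM1"])

# Column access: wrapping each list in a Series lets pandas pad the
# shorter groups with NaN so the columns can have different lengths.
wide = pd.DataFrame({key: pd.Series(values) for key, values in groups.items()})
print(wide["ITEM1"].dropna().tolist())
```

Because the helper takes the mask as an argument, the prefix test can be replaced when the real headers share no naming pattern. One option, assuming the header values are known in advance, is `items.isin(known_headers)` with a hypothetical list `known_headers`.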