Tags: python, python-3.x, pandas, dataframe, pandas-groupby

Extract data from table column and make variables in Python


I have a dataset where I want to create a new variable every time the 'Recording' number changes. Each new variable should contain the 'Duration' values for that recording together with the values from all previous recordings. So for the table below it would be:

Var1 = (3, 3, 3)

Var2 = (3, 3, 3, 4, 6)

Var3 = (3, 3, 3, 4, 6, 4, 3, 1, 4)

And so on. I have several datasets, each of which can have a different number of recordings (always starting from 1) and a different number of durations per recording. Any help is greatly appreciated.

Recording Duration
1 3
1 3
1 3
2 4
2 6
3 4
3 3
3 1
3 4
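
For anyone who wants to reproduce this, the table above could be built as a DataFrame roughly like this. It is only a sketch: the name `df` is an assumption, chosen to match the name used in the solution.

```python
import pandas as pd

# Sample data copied from the table above; `df` is an assumed variable name.
df = pd.DataFrame({
    "Recording": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "Duration": [3, 3, 3, 4, 6, 4, 3, 1, 4],
})
print(df)
```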

Solution

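  • You can aggregate the durations of each recording into a list, take the cumulative sum of those lists (which concatenates them), then convert the results to tuples and collect them in a dictionary: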

    d = df.groupby('Recording')['Duration'].agg(list).cumsum().apply(tuple).to_dict()
    print(d)
    # {1: (3, 3, 3), 2: (3, 3, 3, 4, 6), 3: (3, 3, 3, 4, 6, 4, 3, 1, 4)}

    # Look up the accumulated durations by recording number
    print(d[1])
    print(d[2])
    print(d[3])
    

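    Creating separate variables as in your expected output is possible, but not recommended, because it writes dynamically named variables into the global namespace: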

    s = df.groupby('Recording')['Duration'].agg(list).cumsum().apply(tuple)
    
    for k, v in s.items():
        globals()[f'Var{k}'] = v
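
Since the question mentions several datasets, one way to keep the `Var1`, `Var2`, ... names without touching `globals()` is to return a dictionary per dataset. The following is a minimal sketch, not a definitive implementation: the helper name `cumulative_durations` and its `key`/`value` parameters are hypothetical. It builds the tuples with a plain loop over the groups instead of `cumsum`.

```python
import pandas as pd


def cumulative_durations(df, key="Recording", value="Duration"):
    """Map 'Var<k>' to all durations up to and including recording k.

    Hypothetical helper for illustration; names are not from the original answer.
    """
    result = {}
    seen = []
    # groupby sorts the group keys by default, so recordings arrive as 1, 2, 3, ...
    for rec, durations in df.groupby(key)[value]:
        seen.extend(durations.tolist())
        result[f"Var{rec}"] = tuple(seen)
    return result


# Sample data copied from the question.
df = pd.DataFrame({
    "Recording": [1, 1, 1, 2, 2, 3, 3, 3, 3],
    "Duration": [3, 3, 3, 4, 6, 4, 3, 1, 4],
})

variables = cumulative_durations(df)
print(variables["Var2"])  # (3, 3, 3, 4, 6)
```

Each dataset then gets its own dictionary, so names from one dataset cannot overwrite those of another.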