Tags: python, pandas, dataframe, data-processing

Any way to create a column of tuples from a column of floats in pandas?


I'm given a list of tuples of the following form:

ls = [(14, 6, 1.5), (14, 7, 1.5), (14, 8, 1.5), (14, 9, 1.5), (14, 10, 1.5), (14, 11, 1.5), (14, 12, 1.5), ..., (14, 13, 1.5), (14, 14, 1.5), (14, 15, 1.5)]

I also have a pandas DataFrame in which one column, data['ind'], holds integers that are indices into the above list. I would like to create a new column that contains, for each row, the tuple found at that row's index. I'm doing it this way:

data['ls'] = data['ind'].apply(lambda x: ls[x])

But I get the following error:

ValueError: setting an array element with a sequence.

Is there any way around this error? The same code works perfectly if the list contains floats or integers instead of tuples.


Solution

  • I would first create a Series from your list of tuples:

    LS = pd.Series(ls)
    

    and then call map:

    data['ls'] = data['ind'].map(LS)
    

    Using a sample of your list:

    ls = [(14, 6, 1.5), (14, 7, 1.5), (14, 8, 1.5), (14, 9, 1.5), (14, 10, 1.5), (14, 11, 1.5), (14, 12, 1.5)]
    

    and this sample DataFrame:

    data = pd.DataFrame({'ind':[0,2,3]})
    

    performing the lookup leads to:

    In [10]: LS = pd.Series(ls)
    
    In [11]: LS
    Out[11]: 
    0     (14, 6, 1.5)
    1     (14, 7, 1.5)
    2     (14, 8, 1.5)
    3     (14, 9, 1.5)
    4    (14, 10, 1.5)
    5    (14, 11, 1.5)
    6    (14, 12, 1.5)
    dtype: object
    
    In [12]: data['ls'] = data['ind'].map(LS)
    
    In [13]: data
    Out[13]: 
       ind            ls
    0    0  (14, 6, 1.5)
    1    2  (14, 8, 1.5)
    2    3  (14, 9, 1.5)
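
For convenience, the steps above can be collected into one script. This is only a minimal sketch using the sample data from the answer; the `ls_alt` column and the `missing` Series are illustrative names that do not appear in the original, and the list-comprehension variant is an alternative added here, not part of the answer. It assumes a reasonably recent pandas.

```python
import pandas as pd

# Sample data taken from the question and the answer above.
ls = [(14, 6, 1.5), (14, 7, 1.5), (14, 8, 1.5), (14, 9, 1.5),
      (14, 10, 1.5), (14, 11, 1.5), (14, 12, 1.5)]
data = pd.DataFrame({'ind': [0, 2, 3]})

# Approach from the answer: wrap the list in a Series (default index 0..n-1),
# then map each value of 'ind' to the tuple stored at that index label.
LS = pd.Series(ls)
data['ls'] = data['ind'].map(LS)

# Alternative (an assumption, not from the answer): build a plain Python list
# of tuples and assign it; each tuple is stored as a single object cell.
data['ls_alt'] = [ls[i] for i in data['ind']]

# Behavior difference worth knowing: map() yields NaN for a value that is not
# in the Series index, whereas ls[i] would raise IndexError for the same value.
missing = pd.Series([0, 10]).map(LS)

print(data)
print(missing)
```

Which variant fits better probably depends on how out-of-range indices should be treated: `map` silently leaves a missing value in that row, while indexing the list directly fails immediately.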