python, pandas, nan, abstract-syntax-tree

Pandas: convert nan values to None in string list before literal_eval and convert back to np.nan


I have a dataframe with a few series that contain lists of floats, stored as strings, that include nan values. E.g.:

s[0] = '[1.21, 1.21, nan, nan, 100]'

I want to convert these strings to lists using literal_eval. When I try, I get the error ValueError: malformed node or string on line 1: because, as per the docs, nan is not a literal that literal_eval recognises.
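
A minimal reproduction of that error, as a sketch. The sample string is the one from the question; the variable names `raw` and `message` are only illustrative. The exact wording of the message can differ between Python versions, so the code only checks that a ValueError is raised.

```python
import ast

raw = '[1.21, 1.21, nan, nan, 100]'

try:
    ast.literal_eval(raw)
    message = None
except ValueError as exc:
    # nan is parsed as a name, not a literal, so literal_eval refuses it
    message = str(exc)

print(message)
```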

What is the best way to convert the nan values within the string to None, and then convert them back to np.nan after applying literal_eval?


Solution

  • This follows the approach described in the question, but it leaves None in the lists instead of NaN:

    s.str.replace('nan', 'None', regex=True).apply(ast.literal_eval)
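
    The full round trip asked about in the question (nan to None, literal_eval, None back to np.nan) might look like the sketch below. The helper name `parse_with_nan` and the sample Series are assumptions made for illustration. The word-boundary pattern `\bnan\b` is a cautious choice so that only the standalone token nan is replaced.

```python
import ast
import re

import numpy as np
import pandas as pd

# Sample data in the shape described in the question (assumed for illustration)
s = pd.Series(['[1.21, 1.21, nan, nan, 100]', '[nan, 2.5]'])

def parse_with_nan(text):
    # Swap the bare token nan for None so literal_eval accepts the string
    parsed = ast.literal_eval(re.sub(r'\bnan\b', 'None', text))
    # Map each None back to np.nan after parsing
    return [np.nan if value is None else value for value in parsed]

result = s.apply(parse_with_nan)
```

    Because literal_eval does the parsing, integers such as 100 stay integers; only the former nan entries become floats.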
    

    If you need np.nan values, use a custom function:

    def convert(x):
        out = []
        for y in x.strip('[]').split(', '):
            try:
                out.append(ast.literal_eval(y))
            except (ValueError, SyntaxError):
                # anything literal_eval cannot parse (such as nan) becomes np.nan
                out.append(np.nan)
        return out
    
    s.apply(convert)
    

    Another idea is to convert all values to floats, since float('nan') parses the string nan directly:

    f = lambda x: [float(y) for y in x.strip('[]').split(', ')]
    s.apply(f)
    

    Or build the Series directly with a list comprehension:

    pd.Series([[float(y) for y in x.strip('[]').split(', ')] for x in s],
              index=s.index)
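
    The float-based variants assume every string is non-empty and uses exactly ', ' as the separator. A slightly more tolerant sketch is shown below. The helper name `to_float_list` and the sample Series are assumptions, not part of the original answer.

```python
import math

import pandas as pd

def to_float_list(text):
    # Drop surrounding whitespace and the enclosing brackets
    inner = text.strip().strip('[]').strip()
    if not inner:
        # An empty list string such as '[]' yields an empty list
        return []
    # float() ignores surrounding whitespace and accepts 'nan'
    return [float(token) for token in inner.split(',')]

# Sample data assumed for illustration
s = pd.Series(['[1.21, 1.21, nan, nan, 100]', '[]', '[3,nan ,4.5]'])
as_floats = s.apply(to_float_list)
```

    Note that this turns every value into a float, so 100 becomes 100.0.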