Tags: python, pandas, csv, sklearn-pandas

Avoid storing list as string in pandas


I have a pandas DataFrame with a column that contains lists:

import pandas as pd

df = pd.DataFrame({'ID': ['ID1', 'ID2'],
                   'colA': [['AB', 'CD'], ['AB']]})
df

    ID  colA
0   ID1 [AB, CD]
1   ID2 [AB]

When I save the DataFrame to CSV, lists with multiple values are written as quoted strings, while single-value lists are written without the surrounding quotes:

ID,colA
ID1,"['AB', 'CD']"
ID2,['AB']

The second row is not quoted because its list contains only a single value, so the field has no comma that would need quoting.

The problem appears when I read this CSV back in, because the column then holds strings instead of lists:

    ID  colA
0   ID1 ['AB', 'CD']
1   ID2 ['AB']
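
The behaviour described above can be reproduced without touching the disk. The sketch below is only an illustration and not part of the original question: it assumes an in-memory `io.StringIO` buffer in place of a real file, and the names `buffer`, `df_csv`, `types_before` and `types_after` are made up for the example.

```python
import io

import pandas as pd

df = pd.DataFrame({'ID': ['ID1', 'ID2'],
                   'colA': [['AB', 'CD'], ['AB']]})

# Write the frame as CSV into an in-memory buffer (stand-in for a file).
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Rewind and read the same CSV text back.
buffer.seek(0)
df_csv = pd.read_csv(buffer)

# Compare the Python type of each cell before and after the round trip.
types_before = df['colA'].map(type).tolist()
types_after = df_csv['colA'].map(type).tolist()

print(types_before)  # [<class 'list'>, <class 'list'>]
print(types_after)   # [<class 'str'>, <class 'str'>]
```

Both rows come back as strings, including the unquoted single-value row; CSV has no notion of a list, so the type information is lost on write.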

How can I avoid this? I want to read my data back like this:

    ID  colA
0   ID1 [AB, CD]
1   ID2 [AB]

Solution

  • Save with to_json and load with read_json, which keeps the lists as lists:

    df.to_json('my.json')
    pd.read_json('my.json')
    
        ID      colA
    0  ID1  [AB, CD]
    1  ID2      [AB]
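
To check that the JSON round trip really restores list objects and not just a similar-looking printout, a minimal sketch along these lines may help. It assumes the JSON is kept in memory (`to_json()` called without a path returns a string, which is then wrapped in `io.StringIO`) rather than written to `my.json`; the names `json_text` and `df_json` are illustrative only.

```python
import io

import pandas as pd

df = pd.DataFrame({'ID': ['ID1', 'ID2'],
                   'colA': [['AB', 'CD'], ['AB']]})

# With no path argument, to_json returns the JSON document as a string.
json_text = df.to_json()

# read_json accepts a file-like object; StringIO stands in for 'my.json'.
df_json = pd.read_json(io.StringIO(json_text))

# Each cell in colA should again be a real Python list.
restored = df_json['colA'].tolist()
print(restored)
print(all(isinstance(value, list) for value in restored))
```

JSON has a native array type, so the nested lists survive serialization, whereas CSV can only store their string representation.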