My data consists of 4 columns. They are Name, Age, Seq_1, Seq_2.
For each row (an instance), Name is a string and Age is an integer. Seq_1 is an array of numbers, which can be floats or integers, and its length can differ from row to row. The same applies to Seq_2.
I have stored the whole table in a DataFrame, df.
> type(df), type(df.Seq_1), type(df.loc[0].Seq_1)
(pandas.core.frame.DataFrame, pandas.core.series.Series, list)
I write df to CSV and read it back using
df.to_csv("../data/df.csv", index=False)
df = pd.read_csv("../data/df.csv", low_memory=False)
The problem is: after the write/read round trip, df.loc[0].Seq_1 is no longer a list of numbers.
Before the write/read, I can access the first entry of df.loc[0].Seq_1 with df.loc[0].Seq_1[0].
After the write/read, df.loc[0].Seq_1 seems to have been cast to a string: df.loc[0].Seq_1[0] returns its first character.
For example, before the write/read, df.loc[0].Seq_1[0] returns -2; after the write/read, it returns -.
Thanks in advance.
This is a duplicate of the question "Pandas DataFrame stored list as string: How to convert back to list". I used ast.literal_eval, as given in the second answer there:
from ast import literal_eval
df.Seq_1 = df.Seq_1.apply(literal_eval)
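For anyone who wants to try this end to end, here is a minimal sketch of the round trip. The sample rows are made up (only the column names come from the question), and an in-memory buffer stands in for ../data/df.csv. It assumes the cells are plain Python lists, as the type() output in the question suggests; literal_eval would likely not parse the text that NumPy arrays produce. It also shows a second option that is my own addition rather than part of the linked answer: passing literal_eval through the converters argument of read_csv so the columns are parsed while reading.

```python
import io
from ast import literal_eval

import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Name": ["a", "b"],
    "Age": [30, 41],
    "Seq_1": [[-2, 0.5, 3], [1, 2]],
    "Seq_2": [[7], [8.5, 9, 10]],
})

# Write to an in-memory buffer instead of ../data/df.csv.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Plain read: the list columns come back as strings.
buffer.seek(0)
as_text = pd.read_csv(buffer)
print(type(as_text.loc[0, "Seq_1"]))

# Option 1: convert after reading, one column at a time.
fixed = as_text.copy()
for col in ["Seq_1", "Seq_2"]:
    fixed[col] = fixed[col].apply(literal_eval)

# Option 2: convert while reading, via the converters argument.
buffer.seek(0)
parsed = pd.read_csv(
    buffer,
    converters={"Seq_1": literal_eval, "Seq_2": literal_eval},
)
print(parsed.loc[0, "Seq_1"][0])
```

Either way, both Seq_1 and Seq_2 need converting, since CSV stores every cell as text and has no notion of a list.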