Im trying to work with data organized as a lists of lists within each vertical cell of a column, like this. The data type is numeric for all cells in this column, and each sublist has an int and a float.
I am importing the data using read_excel, and would like to be able to access elements of the sublists with indexing eg.
dataframe["ColumnName"][0][0][0]
to get the first item in the first list
However, the contents of the cells are being read as strings (I also checked this with type()). So for example when I try the indexing above I would get "[", the first character of the string, rather than '2'.
I tried:
dtype = {"colname": float)
- It cant (also cannot convert to int)arr.astype(float)
i get "could not convert string to float".When I copy and paste the data into the python file as a bunch of lists, it's fine and I can access the elements with indexing, but obviously I don't want to be doing that. I would like to better understand why it's converting the lists to strings, any advice is much appreciated.
Your Excel elements are strings along the lines of "[[2, 2.3], [2, 3.4]]" so cannot be directly read as Python Lists and subLists from which the numeric values can be extracted and converted. You need to first change the string into a true List by using for example:
from ast import literal_eval
ml = literal_eval(df['ColName'][0])
and then to get the first value of the first subList:
print(ml[0][0])
No type conversion is necessary as the value is an int
.