I am trying to convert a pandas DataFrame read from a CSV file to a PyTorch tensor, but I am getting a TypeError.
I tried doing this:
import pandas
import torch
df = pandas.DataFrame({"spam": [1, 2, 3, 4], "eggs": [5, 6, 7, 8], "ham": [9, 10, 11, 12]})
print(type(df))
t = torch.from_numpy(df.values)
dataframe = pandas.read_csv('dataset.csv')
print(type(dataframe))
tens = torch.from_numpy(dataframe.values)
This works perfectly for df, but raises a TypeError for dataframe:
TypeError: can't convert np.ndarray of type numpy.object_. The only supported types are: float64, float32, float16, complex64, complex128, int64, int32, int16, int8, uint8, and bool.
Both objects have exactly the same type:
<class 'pandas.core.frame.DataFrame'>
What could be going wrong?
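This issue usually arises when your DataFrame contains non-numeric or mixed-type columns. In that case the .values attribute returns a NumPy array of dtype object, which torch.from_numpy cannot convert; it accepts only the numeric and bool dtypes listed in the error message. First check the column types:
print(dataframe.dtypes)
Make sure all of them are numeric. If they are not, convert the whole frame with dataframe = dataframe.astype(float), or convert columns selectively. Try something like this: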
# For specific columns
dataframe['some_column'] = dataframe['some_column'].astype(float)
# For all columns
dataframe = dataframe.astype(float)
# Then convert to tensor
tens = torch.from_numpy(dataframe.values)
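Make sure the conversion is meaningful for your application: astype(float) raises a ValueError on values that cannot be parsed as numbers, such as free-text columns.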
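If some columns contain stray non-numeric entries, one option is to coerce them to NaN with pandas.to_numeric and then decide what to do with the affected rows. The following is only a minimal sketch: the inline CSV text is a made-up stand-in for dataset.csv, and dropping the rows that fail to parse is an assumption that may not suit your data (filling or fixing the values may be more appropriate).

```python
import io

import pandas as pd
import torch

# Hypothetical CSV contents standing in for dataset.csv. The "eggs" column
# holds one non-numeric entry, so pandas does not read it as a numeric column.
csv_text = "spam,eggs,ham\n1,5,9\n2,six,10\n3,7,11\n"
dataframe = pd.read_csv(io.StringIO(csv_text))

# List the columns that are not numeric; these are what break torch.from_numpy.
non_numeric = dataframe.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)

# Parse every column as numbers; entries that cannot be parsed become NaN.
numeric = dataframe.apply(pd.to_numeric, errors="coerce")

# Assumption for this sketch: rows with unparseable values are simply dropped.
clean = numeric.dropna()

# Request an explicit float dtype so the NumPy array is never of dtype object.
tens = torch.from_numpy(clean.to_numpy(dtype="float32"))
print(tens.dtype, tens.shape)
```

Passing an explicit dtype to to_numpy makes the array type predictable regardless of how pandas stored the individual columns.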