I am trying to compute the average of a column in a dataframe, which I created by converting a parquet file to pandas. I'm getting the error TypeError('Could not convert %s to numeric' % str(x)), which seems to refer to the word "Age" in the column.
The dataframe looks like this:
     _c0    _c1  _c2
0  RecId  Class  Age
1      1    1st   29
2      2    1st    2
3      3    1st   30
My code is:
import pyarrow
import pandas
import pyarrow.parquet as pq
df = pq.read_table("file.parquet").to_pandas()
average_age = df["_c2"].mean()
I tried using
df = df(skiprows=1)
but that gives the error "TypeError: 'DataFrame' object is not callable"
How can I either skip the row with "Age" in it or remove it? And is this related to the data being read from a parquet file, or is it purely a pandas issue?
You can use pandas positional indexing (iloc) to remove the first row:
df = df.iloc[1:,:]
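Dropping the row on its own may not be enough. Because "Age" shared the column with the numbers, _c2 was most likely loaded as strings (object dtype), so the values probably still need converting before .mean() gives a sensible result. Below is a minimal sketch, assuming the frame looks like the one in the question. The sample DataFrame is a hand-built stand-in for the parquet data, not the real file, and promoting the first row to column names is an optional extra step.

```python
import pandas as pd

# Hypothetical stand-in for pq.read_table("file.parquet").to_pandas();
# every value is a string because the header text sits in the first row.
df = pd.DataFrame({
    "_c0": ["RecId", "1", "2", "3"],
    "_c1": ["Class", "1st", "1st", "1st"],
    "_c2": ["Age", "29", "2", "30"],
})

# Use the first row as the column names, then drop it from the data.
df.columns = df.iloc[0].tolist()
df = df.iloc[1:].reset_index(drop=True)

# Convert the string values to numbers; anything unparseable becomes NaN,
# which .mean() skips by default.
df["Age"] = pd.to_numeric(df["Age"], errors="coerce")

average_age = df["Age"].mean()
print(average_age)
```

As for the cause: the header text being stored as a data row, together with the generic _c0/_c1/_c2 names, suggests the header was already lost when the parquet file was written (for example, from a CSV read without a header option), not when pandas read it. That is an assumption about how the file was produced. If you control that step, fixing it there would avoid the cleanup entirely.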