I'm trying to import market data from a csv to run some backtests.
I wrote the following code:
import pandas as pd
import numpy as np
df = pd.read_csv("30mindata.csv")
df = df.drop(columns=['Volume', 'NumberOfTrades', 'BidVolume', 'AskVolume'])
print(df)
I'm getting the error:
KeyError: "['Volume', 'NumberOfTrades', 'BidVolume', 'AskVolume'] not found in axis"
When I remove the line of code containing drop(), the dataframe prints as follows:
Date Time Open High Low Last Volume NumberOfTrades BidVolume AskVolume
0 2018/2/18 14:00:00 2734.50 2741.00 2734.00 2739.75 5304 2787 2299 3005
1 2018/2/18 14:30:00 2739.75 2741.00 2739.25 2740.25 1402 815 648 754
2 2018/2/18 15:00:00 2740.25 2743.50 2739.25 2742.00 4536 2301 2074 2462
3 2018/2/18 15:30:00 2742.25 2744.75 2742.25 2744.00 4102 1826 1949 2153
4 2018/2/18 16:00:00 2744.00 2744.25 2742.25 2742.25 2492 1113 1551 941
... ... ... ... ... ... ... ... ... ... ...
59074 2023/2/17 10:30:00 4076.25 4088.00 4076.00 4086.50 92507 54379 44917 47590
59075 2023/2/17 11:00:00 4086.50 4090.50 4079.25 4081.00 107233 67968 55784 51449
59076 2023/2/17 11:30:00 4081.00 4090.50 4079.50 4088.25 171507 92705 86022 85485
59077 2023/2/17 12:00:00 4088.00 4089.00 4085.25 4086.00 41032 17210 21176 19856
59078 2023/2/17 12:30:00 4086.25 4088.00 4085.25 4085.75 5164 2922 2818 2346
I have another file that uses this exact form of pd.read_csv() and then df.drop(columns=[]), which works just fine. I tried df.loc[:, 'Volume'] and got the same KeyError saying 'Volume' was not found in the axis. I really don't understand how the labels aren't in the dataframe when they get output correctly without the .drop() function.
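It's very likely that your column names contain leading or trailing whitespace, so the real label is something like ' Volume' rather than 'Volume'. The extra spaces are invisible when the dataframe is printed, but they make the label lookup fail.
Try stripping the whitespace from the column names like this: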
import pandas as pd
df = pd.read_csv("30mindata.csv")
df.columns = [col.strip() for col in df.columns]
Then drop the columns as before.
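To confirm that whitespace is really the cause, it helps to print the column labels as a list, since the list repr shows the quotes and any spaces inside them. Below is a minimal sketch of that check plus the fix. The CSV text is a made-up stand-in for 30mindata.csv, assuming its header has a space after each comma; your file may differ (for example, trailing spaces instead).

```python
import io
import pandas as pd

# Hypothetical stand-in for 30mindata.csv: each header has a space after the comma.
csv_text = (
    "Date, Time, Open, High, Low, Last, Volume, NumberOfTrades, BidVolume, AskVolume\n"
    "2018/2/18, 14:00:00, 2734.50, 2741.00, 2734.00, 2739.75, 5304, 2787, 2299, 3005\n"
)

df = pd.read_csv(io.StringIO(csv_text))

# Printing the labels as a list exposes hidden whitespace, e.g. ' Volume'.
raw_columns = df.columns.tolist()
print(raw_columns)

# Strip leading/trailing whitespace from every label, then drop as before.
df.columns = df.columns.str.strip()
df = df.drop(columns=["Volume", "NumberOfTrades", "BidVolume", "AskVolume"])
remaining = df.columns.tolist()
print(remaining)

# Alternative for leading spaces only: let read_csv skip spaces after the delimiter.
df2 = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
clean_columns = df2.columns.tolist()
```

Note that skipinitialspace=True only handles spaces that follow the delimiter, so stripping df.columns is the more general fix if the labels might also have trailing spaces.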