I am trying to import a CSV file containing tick trading data. The file looks like this:
0,2017-09-18 02:00:06,12568.00,1,201,12567.00,12568.00,5462,0,0,C,
0,2017-09-18 02:00:06,12568.50,2,203,12567.00,12568.00,5463,0,0,C,
0,2017-09-18 02:00:06,12569.00,1,204,12567.00,12569.00,5468,0,0,C,
0,2017-09-18 02:00:06,12569.00,1,205,12567.00,12569.00,5470,0,0,C,
0,2017-09-18 02:00:06,12569.50,3,208,12567.00,12569.00,5471,0,0,C,
I am using this Python code:
import pandas as pd
df = pd.read_csv("XG#/20170918.txt", names=['empty', 'date time', 'last', 'last size', 'bid', 'ask'])
print(df.head(1))
My output is this:
empty date time last \
0 2017-09-18 02:00:06 12567.0 200.0 200.0 12567.0 12567.0 5430.0 0.0
last size bid ask
0 2017-09-18 02:00:06 12567.0 200.0 200.0 12567.0 0.0 C NaN
Process finished with exit code 0
My questions are:
df.drop(df.index[0])
Nothing happens. Any help is welcome!
Each row has 11 values plus a trailing comma, which pandas reads as a 12th, empty column, but you supply names for only 6 columns. This is how the code should look:
df = pd.read_csv('lol.csv', usecols=list(range(6)), names=['empty', 'date_time', 'last', 'last_size', 'bid', 'ask'])
I used the first 6 columns; adapt the example below to pick and name the columns you want.
usecols takes a list of the zero-based positions of the columns you want to read, and names assigns a name to each of them, in order.
For example, if you want columns 1, 3 and 4 to be named name, gender and address, the code will look like this:
pd.read_csv('lol.csv', usecols=[1, 3, 4], names=['name', 'gender', 'address'])
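To check how usecols and names line up, here is a minimal sketch that reads two rows copied from the question. It assumes the data is fed through io.StringIO instead of the real file, and the column names are simply the ones used above, not a verified description of the feed.

```python
import io
import pandas as pd

# Two rows copied from the question; StringIO stands in for the real file.
sample = io.StringIO(
    "0,2017-09-18 02:00:06,12568.00,1,201,12567.00,12568.00,5462,0,0,C,\n"
    "0,2017-09-18 02:00:06,12568.50,2,203,12567.00,12568.00,5463,0,0,C,\n"
)

names = ['empty', 'date_time', 'last', 'last_size', 'bid', 'ask']

# Read only the first six columns and name them in order.
df = pd.read_csv(sample, usecols=list(range(6)), names=names)

print(df.columns.tolist())
print(df.shape)
```

The remaining columns, including the empty one created by the trailing comma, are never loaded.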
For the third question:
df = pd.read_csv('lol.csv', usecols=list(range(6)), names=['empty', 'date_time', 'last', 'last_size', 'bid', 'ask'], index_col='date_time')
You can use the index_col parameter to specify which column to use as the index.
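If you also want the index to hold real timestamps rather than strings, one option is to pass parse_dates=True as well. This is an addition on my part, not something the question asked for. A small sketch, again with rows from the question read through io.StringIO:

```python
import io
import pandas as pd

# Two more rows from the question; StringIO stands in for the real file.
sample = io.StringIO(
    "0,2017-09-18 02:00:06,12569.00,1,204,12567.00,12569.00,5468,0,0,C,\n"
    "0,2017-09-18 02:00:06,12569.50,3,208,12567.00,12569.00,5471,0,0,C,\n"
)

df = pd.read_csv(
    sample,
    usecols=list(range(6)),
    names=['empty', 'date_time', 'last', 'last_size', 'bid', 'ask'],
    index_col='date_time',   # use this column as the index
    parse_dates=True,        # optional: parse the index as datetimes
)

print(type(df.index).__name__)
print(df.columns.tolist())
```

Once date_time is the index, it no longer appears among the regular columns.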
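To drop a column after you have read the CSV into a variable (for example df), use the following code. Note that drop returns a new DataFrame unless you pass inplace=True, which is why calling df.drop(df.index[0]) on its own appears to do nothing: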
df.drop('empty', axis=1, inplace=True)
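As a rough illustration of the difference, the sketch below uses a small hand-made frame (the values are only placeholders) and shows that a bare drop call leaves df untouched, while assigning the result keeps the change.

```python
import pandas as pd

# Placeholder frame standing in for the imported data.
df = pd.DataFrame({"empty": [0, 0, 0], "last": [12568.0, 12568.5, 12569.0]})

df.drop(df.index[0])            # result is discarded, df still has 3 rows
rows_after_bare_drop = len(df)

trimmed = df.drop(df.index[0])  # keep the returned frame: first row removed
no_empty = df.drop(columns="empty")  # same idea for a column

print(rows_after_bare_drop, len(trimmed), no_empty.columns.tolist())
```

Reassigning the result (df = df.drop(...)) is an alternative to inplace=True and has the same effect on df.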