I am using the pandas function read_html() to read an HTML table and then convert it to Excel using ExcelWriter and to_excel. But my table has an index column, so this is what I get when I use read_html():
data = pd.read_html(url)
Output:
[ Unnamed: 0 1 3
0 0 3 5
1 1 5 6
2 2 7 2
3 3 4 4
4 4 5 6
5 5 6 7
6 6 4 8
7 7 7 7
8 8 8 8
9 9 9 9]
And when I do
writer = pd.ExcelWriter('example1.xlsx')
data[0].to_excel(writer,sheet_name= 'Sheet1', index=False)
I get an unnamed index column in my Excel file. I have tried index=False and the drop function, but drop gives the error "Can't drop None".
I believe that if you need to remove both column 0 and the index, you can use:
data[0].drop(0, axis=1).to_excel(writer,sheet_name= 'Sheet1', index=False)
To check the column names, convert them to a list; the label passed to drop must match one of them exactly (for example 'Unnamed: 0' rather than 0):
print (data[0].columns.tolist())
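If the printed names show that the first column is labelled 'Unnamed: 0' rather than 0, dropping by position or by name prefix may be more robust than dropping by label. The following is only a sketch under assumptions: the small DataFrame stands in for data[0] (values copied from the first rows of the output in the question), and the helper names drop_first_column and drop_unnamed_columns are hypothetical, not part of pandas.

```python
import pandas as pd

# Stand-in for data[0] returned by pd.read_html(url); the column labels are
# assumed to be strings, as the printed header in the question suggests.
df = pd.DataFrame({
    "Unnamed: 0": [0, 1, 2],
    "1": [3, 5, 7],
    "3": [5, 6, 2],
})


def drop_first_column(frame):
    """Return the frame without its first column, whatever its label is."""
    return frame.iloc[:, 1:]


def drop_unnamed_columns(frame):
    """Return the frame without columns whose label starts with 'Unnamed'."""
    keep = [c for c in frame.columns if not str(c).startswith("Unnamed")]
    return frame[keep]


cleaned = drop_first_column(df)
print(cleaned.columns.tolist())
```

Calling cleaned.to_excel(writer, sheet_name='Sheet1', index=False) afterwards should then write only the data columns. Another option that may fit here is passing index_col=0 to pd.read_html, so that the first column becomes the index and index=False leaves it out of the Excel file.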