I am trying to download historical crypto data from data.binance.vision using Python. I read the zip files into pandas with pd.read_csv. This worked a few months ago, but now it fails with zipfile.BadZipFile: File is not a zip file. I downloaded the data manually and checked it: the file is indeed a zip archive with a CSV file inside. The generated URL also looks correct to me. How should I proceed?
import pandas as pd
import json, requests
base_url = 'https://data.binance.vision/?prefix=data/spot/monthly/klines/NULSBTC/1w/'
url = f'{base_url}NULSBTC-1w-2023-02.zip'
df = pd.read_csv(url)
print(df)
Error:
zipfile.BadZipFile: File is not a zip file
Update 1: add column names
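You are using the base URL meant for browsing, not the one for downloading. The ?prefix= address returns the HTML file-listing page, and because the URL ends in .zip, pandas tries to unzip that HTML and raises BadZipFile. Remove ?prefix= from the base URL: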
import pandas as pd

# Download URL: the browsing URL without '?prefix='
base_url = 'https://data.binance.vision/data/spot/monthly/klines/NULSBTC/1w/'
url = f'{base_url}NULSBTC-1w-2023-02.zip'
# The CSV has no header row, so supply the column names explicitly
cols = ['Open time', 'Open', 'High', 'Low', 'Close', 'Volume', 'Close time',
        'Quote asset volume', 'Number of trades', 'Taker buy base asset volume',
        'Taker buy quote asset volume', 'Ignore']
df = pd.read_csv(url, header=None, names=cols)
Output:
>>> df
       Open time      Open      High       Low     Close     Volume     Close time  Quote asset volume  Number of trades  Taker buy base asset volume  Taker buy quote asset volume  Ignore
0  1675641600000  0.000011  0.000015  0.000011  0.000013  7807851.0  1676246399999          100.579287             26961                    4032754.0                     52.369729       0
1  1676246400000  0.000013  0.000013  0.000011  0.000012  3974078.0  1676851199999           47.905902             17172                    1972135.0                     23.822658       0
2  1676851200000  0.000012  0.000016  0.000012  0.000012  6897952.0  1677455999999           91.021976             23500                    3428533.0                     45.894707       0
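If you hit BadZipFile again, it can help to check what the URL actually returns before handing it to pandas. The following is only a minimal sketch using the standard library; the helper names is_zip_payload and describe_payload are made up for illustration and are not part of pandas or any Binance tooling.

```python
import io
import zipfile


def is_zip_payload(data: bytes) -> bool:
    """Return True if the raw bytes form a readable zip archive."""
    return zipfile.is_zipfile(io.BytesIO(data))


def describe_payload(data: bytes) -> str:
    """Summarise what a downloaded payload actually contains."""
    if is_zip_payload(data):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            return "zip archive containing: " + ", ".join(archive.namelist())
    # An HTML page starts with markup rather than zip data,
    # so a short preview usually makes the problem obvious.
    preview = data[:40].decode("utf-8", errors="replace")
    return f"not a zip archive; payload starts with: {preview!r}"


# Example usage (needs network access, so it is left commented out):
# import requests
# data = requests.get(url).content
# print(describe_payload(data))
```

With the ?prefix= URL from the question, the preview should show HTML markup rather than zip data, which is why pandas cannot unzip it.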