I would like to read a set of csv
files from URL as dataframes. These files contain a date in their name like YYYYMMDD.csv
in their names. I need to iterate over a set of predefined dates and read the corresponding file into a Python.
Sometimes the file does not exist and an error as follows is thrown:
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
What I would do in this situation is to add one day to the date like turning 2020-05-01
to 2020-05-02
and in case of throwing the aforementioned error I would add 2 days to the date or at most 3 days until there is a url available without an error.
I would like to know how I can write it in a program maybe with nested try
- except
where if adding 1 day to the date leads to a URL without error the subsequent steps are not executed.
As I don't have the data I will use the following URL as an example:
import pandas as pd
import requests
url = 'http://winterolympicsmedals.com/medals.csv'
s = requests.get(url).content
c = pd.read_csv(s)
Here the file being read is medals.csv
. If you try madels.csv
or modals.csv
you will get the error I am talking about. So I need to know how I can control the errors in 3 steps by replacing the file name until I get the desired dataframe like first we try madels.csv
resulting in an error, then models.csv
also resulting in an error and after that medals.csv
which result in the desired output.
My problem is that sometimes the modification I made to the file also fails in except
so I need to know how I can accommodate a second modification.
No need to do any nested try-except
blocks, all you need is one try-except
and a for
loop.
First, function that tries to read a file (returns content of the file, or None
if the file is not found):
def read_file(fp):
try:
with open(fp, 'r') as f:
text = f.read()
return text
except Exception as e:
print(e)
return None
Then, function that tries to find a file from a predefined date (example input would be '20220514'
). The functions tries to read content of the file with the given date, or dates up to 3 days after it:
from datetime import datetime, timedelta
def read_from_predefined_date(date):
date_format = '%Y%m%d'
date = datetime.strptime(date, date_format)
result = None
for i in range(4):
date_to_read = date + timedelta(days=i)
date_as_string = date_to_read.strftime(date_format)
fp = f'data/{date_as_string}.csv'
result = read_file(fp)
if result:
break
return result
To test, e.g. create a data/20220515.csv
and run following code:
d = '20220514'
result = read_from_predefined_date(d)