I am trying to store data retrieved from an API in a CSV file.
I am querying the API / loading the data using the following:
import json
from urllib.request import urlopen

def load_data(id):
    # Fetch the JSON document for one id and decode it
    with urlopen('url' + str(id)) as response:
        source = response.read()
    data = json.loads(source)
    return data
This returns a dict like:
{'name': 'Blah',
 'address_1': 'Street',
 'address_2': 'Town',
 'website': 'www.blah.com'}
I am then trying to iterate through a list of target id numbers to retrieve the data like so:
import csv

for x in targets:
    try:
        data = load_data(x)
        name = data['name']
        address_1 = data['postalAddressLine1']
        address_2 = data['postalAddressLine2']
        website = data['website']
    except KeyError as e:
        pass
    with open('test.csv', 'w', newline='') as csvfile:
        # Declaring the writer
        data_writer = csv.writer(csvfile, quoting=csv.QUOTE_ALL)
        # Writing the headers
        data_writer.writerow(['name', 'address_1', 'address_2', 'website'])
        # Writing the data
        data_writer.writerow([name, address_1, address_2, website])
The problem I am having is that a data point is missing on some of the iterations, e.g. on loop 2 there is no website, which raises a KeyError and crashes the code, so I added the try and except to catch this.
But now it seems that I am only returning data for the ids which have all of the above data points.
What I would like to do is return all of the data possible and ignore/fill in blank values where there is a KeyError.
So I am wondering: is my logic set up correctly, and how can I achieve the above goal?
Please let me know if this is not worded very well!
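To illustrate the behaviour described above, here is a minimal sketch that runs without the API. The records are made-up stand-ins for the API responses, and only two fields are used to keep it short. As soon as one lookup raises KeyError, the rest of the try block is skipped, so the whole record is lost rather than just the missing field.

```python
# Hypothetical records standing in for API responses;
# the second one has no 'website' key.
records = [
    {'name': 'Blah', 'website': 'www.blah.com'},
    {'name': 'Other'},
]

rows = []
for data in records:
    try:
        name = data['name']
        website = data['website']     # raises KeyError for the second record
        rows.append([name, website])  # skipped whenever a key is missing
    except KeyError:
        pass  # the whole record is silently dropped

print(rows)  # only the complete record survives
```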
Edit: My code wasn't writing each row of data because I had the writer in the wrong part of the loop. Here is the updated code with the corrected structure, using Roland Smith's answer to handle missing values.
empty_value = 'TBC'
with open('test.csv', 'w', newline='') as csvfile:
    # Declaring the writer
    data_writer = csv.writer(csvfile, quoting=csv.QUOTE_ALL)
    # Writing the headers
    data_writer.writerow(['name', 'address_1', 'address_2', 'website'])
    for x in targets:
        data = load_data(x)
        # dict.get returns empty_value instead of raising KeyError,
        # so no try/except is needed around these lookups
        name = data.get('name', empty_value)
        address_1 = data.get('postalAddressLine1', empty_value)
        address_2 = data.get('postalAddressLine2', empty_value)
        website = data.get('website', empty_value)
        # Writing the data
        data_writer.writerow([name, address_1, address_2, website])
What I would suggest is to add missing keys manually:
required = ('name', 'address_1', 'address_2', 'website')
data = load_data(x)
for key in required:
    if key not in data:
        data[key] = 'not available'
Now your data at least contains all the keys you expect.
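The same idea can be wrapped in a small helper so it is easy to try without calling the API. This is only a sketch: the function name fill_missing and the sample dict are made up for illustration, and it returns a copy instead of changing the dict in place.

```python
REQUIRED = ('name', 'address_1', 'address_2', 'website')

def fill_missing(data, required=REQUIRED, placeholder='not available'):
    """Return a copy of data in which every required key is present."""
    filled = dict(data)  # shallow copy, the original dict is left untouched
    for key in required:
        # setdefault only inserts the placeholder when the key is absent
        filled.setdefault(key, placeholder)
    return filled

# Hypothetical response with two of the four keys missing
sample = {'name': 'Blah', 'address_1': 'Street'}
filled = fill_missing(sample)
print(filled)
```

Existing values are never overwritten, so complete records pass through unchanged.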
Alternatively, you could use the default argument of the get method. Note that dict.get does not accept keyword arguments, so the default has to be passed positionally:
ds = 'not available'
name = data.get('name', ds)
address_1 = data.get('address_1', ds)
address_2 = data.get('address_2', ds)
website = data.get('website', ds)
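Since the end goal is a CSV file, another option worth considering is csv.DictWriter from the standard library, which can fill in missing fields itself through its restval parameter. The sketch below is an assumption about how that could look here, not something tested against the real API: the records are made up, and it writes to an in-memory buffer instead of test.csv so it can be run as is.

```python
import csv
import io

fieldnames = ['name', 'address_1', 'address_2', 'website']

# Hypothetical records standing in for load_data() results
records = [
    {'name': 'Blah', 'address_1': 'Street', 'address_2': 'Town',
     'website': 'www.blah.com'},
    {'name': 'Other', 'address_1': 'Road'},  # address_2 and website missing
]

buffer = io.StringIO()  # swap for open('test.csv', 'w', newline='') in real use
writer = csv.DictWriter(
    buffer,
    fieldnames=fieldnames,
    restval='not available',  # written for any field missing from a row
    extrasaction='ignore',    # silently skip keys that are not in fieldnames
    quoting=csv.QUOTE_ALL,
)
writer.writeheader()
for record in records:
    writer.writerow(record)

csv_text = buffer.getvalue()
print(csv_text)
```

With this approach the dicts do not need to be patched beforehand, and extra keys returned by the API do not cause an error.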