I have emails and dates. I can use two nested for loops to pick out the emails sent on the same date, but how can I do it the 'smart way', i.e. efficiently?
# list of tuples - (email, date)
for entry in list_emails_dates:
    current_date = entry[1]
    for next_entry in list_emails_dates:
        if current_date == next_entry[1]:
            list_one_date_emails.append(next_entry)
I know it can be written in shorter code, but I don't know itertools, or maybe use map, xrange?
You can just convert this to a dictionary, by collecting all emails related to a date into the same key.
To do this, you need to use defaultdict from collections. It is an easy way to give a new key in a dictionary a default value.
Here we are passing in list as the default factory, so that each new key in the dictionary starts out with an empty list as its value.
from collections import defaultdict

emails = defaultdict(list)
for email, email_date in list_of_tuples:
    emails[email_date].append(email)  # key by date, collect the emails
Now, you have emails['2013-14-07'], which will be a list of emails for that date.
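As a self-contained sketch of the grouping step, here is the same idea run on a small made-up list. The sample addresses and the helper name `group_by_date` are assumptions for illustration only:

```python
from collections import defaultdict

def group_by_date(pairs):
    """Group (email, date) tuples into a dict mapping date -> list of emails."""
    grouped = defaultdict(list)
    for email, email_date in pairs:
        grouped[email_date].append(email)
    return grouped

# Hypothetical sample data
sample = [
    ('foo@bar.com', '2014-07-01'),
    ('zoo@foo.com', '2014-07-01'),
    ('a@b.com', '2014-07-03'),
]

emails_by_date = group_by_date(sample)
print(emails_by_date['2014-07-01'])  # ['foo@bar.com', 'zoo@foo.com']
```

This walks the list once, so the work grows linearly with the number of entries, whereas the two nested loops compare every entry against every other entry.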
If we don't use a defaultdict, and do a dictionary comprehension like this:
emails = {x[1]:x[0] for x in list_of_tuples}
You'll have one entry for each date, which will be the last email for that date, since assigning to the same key overwrites its value. A dictionary is the most efficient way to look up a value by a key. A list is good if you want to look up a value by its position in a series of values (assuming you know its position).
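The overwriting behaviour of the comprehension can be seen in a short sketch. The sample data and the name `last_email_by_date` are made up for illustration:

```python
# Hypothetical sample data: two emails share the date 2014-07-01
sample = [
    ('foo@bar.com', '2014-07-01'),
    ('zoo@foo.com', '2014-07-01'),
    ('a@b.com', '2014-07-03'),
]

# Each date key is assigned once per matching tuple; the later assignment wins.
last_email_by_date = {date: email for email, date in sample}

print(last_email_by_date['2014-07-01'])  # zoo@foo.com
print(len(last_email_by_date))           # 2
```

The first email for 2014-07-01 is silently dropped, which is why the defaultdict version is the one to use when you want every email for a date.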
If for some reason you are not able to refactor it, you can use this template method, which will create a generator:
def find_by_date(haystack, needle):
    for email, email_date in haystack:
        if email_date == needle:
            yield email
Here is how you would use it:
>>> email_list = [('foo@bar.com','2014-07-01'), ('zoo@foo.com', '2014-07-01'), ('a@b.com', '2014-07-03')]
>>> all_emails = list(find_by_date(email_list, '2014-07-01'))
>>> all_emails
['foo@bar.com', 'zoo@foo.com']
Or, you can do this:
>>> july_first = find_by_date(email_list, '2014-07-01')
>>> next(july_first)
'foo@bar.com'
>>> next(july_first)
'zoo@foo.com'
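Keep in mind that a generator is consumed as you step through it. The sketch below repeats find_by_date so that it runs on its own, and shows what happens once the matches run out; passing `None` as the default to `next` is just one possible choice:

```python
def find_by_date(haystack, needle):
    """Lazily yield every email whose date equals needle."""
    for email, email_date in haystack:
        if email_date == needle:
            yield email

email_list = [('foo@bar.com', '2014-07-01'),
              ('zoo@foo.com', '2014-07-01'),
              ('a@b.com', '2014-07-03')]

july_first = find_by_date(email_list, '2014-07-01')
first = next(july_first)        # 'foo@bar.com'
second = next(july_first)       # 'zoo@foo.com'

# Without a default, a third next() would raise StopIteration.
# Supplying a default returns it instead.
third = next(july_first, None)  # None
```

If you need to go through the matches more than once, either call find_by_date again or store the results in a list as in the first usage example.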