I am trying to pair a list of emails with a list of names as tuples. However, once the names run out, the emails that have no name to pair with are left out of the result. How can I make these extra emails pair with an empty string ("") instead?
Essentially, I have Excel rows in the following format, which I load into a pandas DataFrame:
cust_ID | buyer_names | buyer_emails |
---|---|---|
1234 | name 1; name 2; name 3 | email1; email2; email3; email4 |
..... | ..... | ...... |
I tried this:
import re
import pandas as pd

# Set regular expression to catch emails (the dot before the domain suffix is escaped)
regex = r"[a-zA-Z0-9_.+-]*@[a-zA-Z0-9-]+\.[a-zA-Z\.]*"
# Initialise empty list to add query-ready emails
emails_query_format = []
# Iterate over retailer_id / emails template rows and append formatted emails to list
for i, row in df.iterrows():
    # Put all emails in the row into a list
    emails = re.findall(regex, row['additional_emails'])
    emails = [email.strip() for email in emails]
    # Put all additional buyers into a list
    buyer_names = row['additional_buyers']
    buyers = re.split(r";", buyer_names)
    buyers = [buyer.strip() for buyer in buyers]
    buyer_email_tuple = [*zip(emails, buyers)]
Then, still inside the loop over the rows, I iterate over these pairs and put each one into the query format, like this:
    # For each pair, create a row in the query format
    for email, buyer in buyer_email_tuple:
        # Put the values into a specific format to copy-paste into the query template
        query_format = f"({row['retailer_id']},'{buyer}','{email}'),"
        emails_query_format.append(query_format)

# New DataFrame to hold the query-ready emails
query_df = pd.DataFrame(emails_query_format, columns=['query_ready'])
With this approach, the resulting list of tuples does not include the extra 'email4'. The containers in the collections module came to mind, but I didn't see a clear way of using a defaultdict for this.
How can I make the result include email4, paired with simply "" as its name?
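A minimal reproduction of the problem, using made-up sample values rather than the real spreadsheet data: `zip` stops as soon as the shorter list is exhausted.

```python
# Sample values are hypothetical, chosen to mirror the row shown above
emails = ["email1", "email2", "email3", "email4"]
buyers = ["name 1", "name 2", "name 3"]

# zip stops at the shortest input, so email4 never appears
pairs = list(zip(emails, buyers))
print(pairs)
# [('email1', 'name 1'), ('email2', 'name 2'), ('email3', 'name 3')]
```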
Thanks in advance.
Solved the issue:
# Pair each email with the buyer at the same index, or "" once the buyers run out
buyer_emails_tuple_list = []
for idx in range(len(emails)):
    if idx < len(buyers):
        buyer_emails_tuple_list.append((buyers[idx], emails[idx]))
    else:
        buyer_emails_tuple_list.append(("", emails[idx]))
Now the emails that have no corresponding buyer name are paired with an empty string, for example:
("", email4)
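A possible alternative that avoids the index bookkeeping is `itertools.zip_longest` from the standard library, which pads the shorter input with a `fillvalue`. The sketch below assumes `buyers` and `emails` are plain lists as in the loop above, filled here with made-up sample values.

```python
from itertools import zip_longest

# Hypothetical sample values standing in for one spreadsheet row
buyers = ["name 1", "name 2", "name 3"]
emails = ["email1", "email2", "email3", "email4"]

# Pad whichever list is shorter with "" instead of truncating
buyer_emails_tuple_list = list(zip_longest(buyers, emails, fillvalue=""))
print(buyer_emails_tuple_list[-1])
# ('', 'email4')
```

One difference to keep in mind: the loop above only walks over the emails, whereas `zip_longest` would also keep a buyer that has no email, paired as (buyer, ""). The two are only equivalent when there are at least as many emails as buyers.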