I'm trying to create hyperlinks in an dataset that is eventually exported using xlswriter using a function I create in my python script. However, the output produced is not as expected.
=HYPERLINK()
in Excel takes two arguments. These are the destination URL and the anchor text. It is this that I am trying to produce in Python using XLSXWriter so that when imported, Excel is able to read column as a link.
See exmaple below.
#create exmaple df
df = pd.DataFrame({'urls': ['https://www.google.com', 'https://www.youtube.com', 'https://www.twitter.com','https://www.reddit.com']})
#create function
def make_hyperlink(x):
for urls in df['urls']:
return '=HYPERLINK(' + urls, "Site Link"
#apply function
df['hyperlink'] = df['urls'].apply(lambda x: make_hyperlink(x))
#view df
df
output:
urls hyperlink
0 https://www.google.com (=HYPERLINK(https://www.google.com, Site Link)
1 https://www.youtube.com (=HYPERLINK(https://www.google.com, Site Link)
2 https://www.twitter.com (=HYPERLINK(https://www.google.com, Site Link)
3 https://www.reddit.com (=HYPERLINK(https://www.google.com, Site Link)
I am expecting the output to display each of the individual URLs with 'Site Link' as the anchor text so I am unsure as to why the function only applies to the first URL but four times.
Guidance appreciated.
There are a couple of issues in your make_hyperlink()
function and how you apply it. Here is a corrected version:
import pandas as pd
#create exmaple df
df = pd.DataFrame({'urls': ['https://www.google.com', 'https://www.youtube.com',
'https://www.twitter.com','https://www.reddit.com']})
#create function
def make_hyperlink(url):
return '=HYPERLINK("%s", "Site Link")' % url
#apply function
df['hyperlink'] = df['urls'].apply(make_hyperlink)
#view df
print(df)
# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas.xlsx', engine='xlsxwriter')
# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')
# Close the Pandas Excel writer and output the Excel file.
writer.close()
Output: