I am new to python and also to the pandas library in Python. The documentation is not well described and they didn't explain it well. I want to save the dataframe as an excel format and in the memory and I have found the following explanation: [Pandas excel to the memory]
I need an explanation about workbook
. The value of this variable is encoded, how can I see the real value of this variable? How to decode it? what should be return value for it?
EDITED:
How to pass it into the content of attachment in Mandrill api.
https://mandrillapp.com/api/docs/messages.python.html
This is my attachment part for excel extension :
'attachments': [
{
'content': content,
'name': 'fraud_report.xlsx',
'type': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
}
I am not able to open the excel file and all the time i'am getteing an error from Microsoft excel which said the file format is not valid!...
Any help can be helpful. Thank you
For the explanation i paste the example from your link here again:
# Safe import for either Python 2.x or 3.x
try:
from io import BytesIO
except ImportError:
from cStringIO import StringIO as BytesIO
bio = BytesIO()
# By setting the 'engine' in the ExcelWriter constructor.
writer = ExcelWriter(bio, engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
# Save the workbook
writer.save()
# Seek to the beginning and read to copy the workbook to a variable in memory
bio.seek(0)
workbook = bio.read()
The method writer.save()
saves the data in BytesIO (bio
) instead of Excel file. It means, that the variable bio
stores the bytes codes of the excel file.
The method bio.seek(0)
sets the current position (for reading, writing,...) of bio
to 0. So that you can read the data of bio
from beginning with the next method bio.read()
.
The variable workbook
store the bytes string of the excel file (or excel workbook). If you read a excel file in byte mode, you will get the same data. Or you can write it in a excel file:
with open("my_excel_file.xlsx", "wb") as f:
f.write(workbook)
To read data from bio
and store in a DataFrame you don't need bio.read()
:
bio.seek(0)
df = pd.read_excel(bio, "Sheet1", engine="xlrd")
For your question about using of mandrill:
In the example of mandrill you see:
{'attachments': [{'content': 'ZXhhbXBsZSBmaWxl',
'name': 'myfile.txt',
'type': 'text/plain'}],...
The documentation writes about it too:
content: the content of the attachment as a base64-encoded string
You should encode workbook
in base64 and use it for sending
import base64
content = base64.b64encode(workbook)
P/S: workbook
and content
have the type bytes
. May be you need to convert content
to str
before sending.
{'attachments': [{'content': content.decode('utf-8'),
'name': 'myfile.xlsx',
'type': 'text/plain'}],...
Add: if file is excel then you should change type
to application/vnd.openxmlformats-officedocument.spreadsheetml.sheet