Downloading a file with utf-8 characters using web2py fails


Many thanks for reading.
I am developing a small hobby app that automatically uploads files and publishes them on the web, using Python on the local machine and web2py on the web side.

The uploaded filenames contain Greek characters encoded as UTF-8, e.g.

New_Υπολογιστικό_φύλλο_OpenDocument.ods

In the default web2py controller I use the following action to:
1: Get the filename of the link clicked in the view, and
2: Send that file to the browser:

def myaction():
    import os
    import urllib
    # folder where the files are uploaded
    path = request.folder + 'files_to_transfer'
    # change cwd to path
    cwd = os.chdir(path)
    filename = urllib.quote_plus(request.vars['z'])
    return response.stream(filename)

In the same default controller I also use the following code to prepare the list of filenames used by the view:

def index():
    import os
    import json

    path = request.folder + 'files_to_transfer'
    # change cwd to path
    cwd = os.chdir(path)
    filenames = []
    filenames_excluded = ['httpserver.log', 'server_files.json']
    for file in os.listdir(u'.'):
        if file not in filenames_excluded:
            filenames.append(file.encode('utf-8'))
    return dict(message=filenames)
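
For readers who want to try the listing logic outside web2py, here is a minimal sketch in Python 3 (the code above is Python 2). The function name `list_transfer_files` and the temporary demo directory are assumptions made for illustration, and the result is sorted only to make the output predictable. In Python 3, `os.listdir` already returns text strings, so no `.encode('utf-8')` step is needed.

```python
import os
import tempfile

def list_transfer_files(folder, excluded=("httpserver.log", "server_files.json")):
    """Return the sorted entry names in `folder`, skipping the excluded ones."""
    return sorted(name for name in os.listdir(folder) if name not in excluded)

# Demo in a throwaway directory, so nothing depends on a web2py installation.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("New_WinRAR_archive.rar",
                 "New_Υπολογιστικό_φύλλο_OpenDocument.ods",
                 "httpserver.log"):
        open(os.path.join(demo_dir, name), "w").close()
    demo_listing = list_transfer_files(demo_dir)
    print(demo_listing)
```

Passing the folder explicitly also avoids `os.chdir`, which changes the working directory for the whole server process.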

and in the View I use:

<html>
  <head><meta charset="utf-8" /></head>
  <body>

    <h2>
      {{for x in range(len(message)):}}

      {{=A('click me', callback=URL('myaction', args=['x', 'y'], vars=dict(z=message[x])))}}
      {{=message[x]}}<br />{{pass}}
    </h2>
  </body>
</html>

The links have the following form:

click me New WinRAR archive.rar
click me New Υπολογιστικό φύλλο OpenDocument.ods
click me New_WinRAR_archive.rar etc...

Problem: Only files whose names contain English (ASCII) characters are downloaded. urllib.quote_plus does its job and produces a link like:

http://127.0.0.1:8000/webappfiletransfer/default/myaction/x/y?z=New_%CE%A5%CF%80%CE%BF%CE%BB%CE%BF%CE%B3%CE%B9%CF%83%CF%84%CE%B9%CE%BA%CF%8C_%CF%86%CF%8D%CE%BB%CE%BB%CE%BF_OpenDocument.ods

but response.stream returns a 404 NOT FOUND error. I have tried variations with urlencode, but to no avail. Any thoughts or comments are greatly appreciated.

EDIT: I have tried the use of:

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

but with no luck.


Solution

  • I finally managed to solve the problem by paying more attention to the 404 HTTP error. Although the file existed, it could not be found.

    By viewing the web2py error tickets, I found out that

    response.stream(filename)

    was trying to locate filename:

    New_%CE%A5%CF%80%CE%BF%CE%BB%CE%BF%CE%B3%CE%B9%CF%83%CF%84%CE%B9%CE%BA%CF%8C_%CF%86%CF%8D%CE%BB%CE%BB%CE%BF_OpenDocument.ods

    and not

    New_Υπολογιστικό_φύλλο_OpenDocument.ods

    So I revised the code as shown below and added two response headers so that the file is offered as a download:

    import os
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')
    path = request.folder + 'files_to_transfer'
    cwd = os.chdir(path)
    # request.vars['z'] arrives as a UTF-8 byte string; decode it to unicode
    # so that it matches the name of the file on disk.
    # urllib.quote / quote_plus is not needed here: it transforms the
    # filename into a percent-encoded form that does not exist on disk.
    filename = request.vars['z'].decode('utf-8')
    # Note: the HTTP header name is 'Content-Type' (with a hyphen)
    response.headers['Content-Type'] = "application/octet-stream"
    response.headers['Content-Disposition'] = "attachment; filename=" + filename
    return response.stream(filename)
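
The Content-Disposition header above concatenates the raw filename, and browsers do not all handle non-ASCII names in that form the same way. RFC 6266 defines a `filename*` parameter with a percent-encoded UTF-8 value for this purpose. The following is a minimal Python 3 sketch of building such a header value; the helper name `content_disposition` is hypothetical and is not part of web2py, and the ASCII fallback rule is an assumption chosen for illustration.

```python
from urllib.parse import quote

def content_disposition(filename):
    """Build an 'attachment' header value that carries a non-ASCII filename."""
    # ASCII-only fallback for clients that ignore filename*:
    # every non-ASCII character becomes '?', and double quotes become '_'.
    fallback = filename.encode("ascii", "replace").decode("ascii").replace('"', "_")
    # Percent-encoded UTF-8 form for the filename* parameter.
    encoded = quote(filename, safe="")
    return "attachment; filename=\"%s\"; filename*=UTF-8''%s" % (fallback, encoded)

print(content_disposition("New_Υπολογιστικό_φύλλο_OpenDocument.ods"))
```

Here the percent-encoding is applied only to the header value sent to the browser, not to the name used to open the file on disk.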
    

    The main problems were:

    1. Using urllib.quote_plus was unnecessary and harmful here. It percent-encoded the filename exactly as it is designed to, but the percent-encoded name does not match the name of the file on disk.

    2. The filename coming from the view was already UTF-8 encoded, so it had to be decoded with .decode('utf-8') before the file could be found.
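
The first point can be reproduced in isolation. This is a minimal sketch in Python 3, where `quote_plus` and `unquote_plus` live in `urllib.parse` rather than in `urllib` as in the Python 2 code above. It assumes, as the error ticket suggests, that the framework has already decoded the query string once before the controller sees `request.vars['z']`.

```python
from urllib.parse import quote_plus, unquote_plus

# Value of the `z` query parameter as it appears in the link from the question.
ENCODED = ("New_%CE%A5%CF%80%CE%BF%CE%BB%CE%BF%CE%B3%CE%B9%CF%83%CF%84"
           "%CE%B9%CE%BA%CF%8C_%CF%86%CF%8D%CE%BB%CE%BB%CE%BF_OpenDocument.ods")

# Decoding the query string once gives the real filename on disk.
real_name = unquote_plus(ENCODED)

# Quoting the decoded value again recreates the percent-encoded text,
# which is not the name of any file in the folder, hence the 404.
requoted = quote_plus(real_name)

print(real_name)
print(requoted == ENCODED)
```

The name passed to the file lookup has to be `real_name`; the percent-encoded form belongs only in the URL.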

    EDIT : Please note that without

    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8') 
    

    I get a server error, so you will probably need to include it.

    Hope someone finds it useful.