
Why are URI characters (or at least spaces) being dropped from the filename on an HTML file upload?


I have a file upload form and would like to use the original filename on the server, but I notice that the spaces are dropped once the file is uploaded. On the client/browser I can do something like this in a handler called after the input type='file' element has changed:

function process_svg (e) {
    var files = e.target.files || e.originalEvent.dataTransfer.files;
    // File objects expose the name as .name (there is no .filename property)
    console.log(files[0].name);
}

If I upload a file named 'some file - type.ext', the console prints 'some file - type.ext' as expected. On the server (running bottle), however, if I run:

from bottle import route, request

@route('/some_route', method='POST')  # file uploads arrive as POST requests
def some_route():
    print(request.files['form_name_attr'].filename)

I get 'somefile-type.ext'. I am guessing this has to do with URI escaping (or the lack thereof), but since you cannot change a file's name before upload, how do you get around this and preserve the name? Strangely, I cannot find any mention of this on Google. In part I have had trouble thinking of appropriate search terms, but I'm also aware that this may not be native behaviour at all and could be a bug elsewhere in my code.

I do not think that is the case: I issued the console.log and print statements right before the upload and right when the server starts processing the request, and I do not believe any of my code touches the filename in between. If I am wrong about that, please let me know, as I could be looking in the wrong direction.


Solution

  • You want raw_filename, not filename.

    (Note that it may contain unsafe characters.)

    @route('/some_route', method='POST')
    def some_route():
        print(request.files['form_name_attr'].filename)  # "cleaned" file name
        print(request.files['form_name_attr'].raw_filename)  # unmodified file name
    

    Found this in the source code for FileUpload.filename:

    Only ASCII letters, digits, dashes, underscores and dots are allowed in the final filename. Accents are removed, if possible. Whitespace is replaced by a single dash. Leading or tailing dots or dashes are removed. The filename is limited to 255 characters.
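
To make the quoted rules and the warning about unsafe characters concrete, here is a rough standalone sketch. It is not bottle's actual source. clean_filename is an approximation of the rules quoted above, and keep_spaces_filename is a hypothetical helper (the name and the 'upload' fallback are assumptions) showing one way to keep spaces from raw_filename while still dropping directory components. The exact result from a given bottle version may differ.

```python
import re
import unicodedata


def clean_filename(raw_name: str) -> str:
    """Approximate the cleaning rules quoted above (not bottle's own code)."""
    # Decompose accented characters, then drop everything outside ASCII.
    name = unicodedata.normalize("NFKD", raw_name)
    name = name.encode("ascii", "ignore").decode("ascii")
    # Keep only letters, digits, dashes, underscores, dots and whitespace.
    name = re.sub(r"[^a-zA-Z0-9\-_.\s]", "", name).strip()
    # Collapse runs of whitespace/dashes into one dash, then trim dots/dashes.
    name = re.sub(r"[-\s]+", "-", name).strip(".-")
    return name[:255]


def keep_spaces_filename(raw_name: str) -> str:
    """Hypothetical helper: preserve spaces, but strip path components."""
    # Treat both slash styles as separators and keep only the last component.
    name = raw_name.replace("\\", "/").rsplit("/", 1)[-1]
    # Drop control characters such as NUL, tabs or newlines.
    name = "".join(ch for ch in name if ch.isprintable()).strip()
    # Never return an empty name or a directory reference.
    if name in ("", ".", ".."):
        return "upload"  # assumed fallback name
    return name[:255]


print(clean_filename("some file - type.ext"))
print(keep_spaces_filename("../../tmp/some file - type.ext"))
```

In a bottle handler, the second helper would be applied to request.files['form_name_attr'].raw_filename before using the name on disk. Whether that level of filtering is sufficient depends on the target filesystem and on how the name is used afterwards.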