The Flask documentation for file uploads recommends using secure_filename() to sanitize a file's name before storing it.
Here's their example:
uploaded_file = request.files['file']
if uploaded_file:
    filename = secure_filename(uploaded_file.filename)  # <<<< note the use of secure_filename() here
    uploaded_file.save(os.path.join(app.config['UPLOAD_FOLDER'], filename))
    return redirect(url_for('display_file', filename=filename))
The documentation says:
Now the problem is that there is that principle called “never trust user input”. This is also true for the filename of an uploaded file. All submitted form data can be forged, and filenames can be dangerous. For the moment just remember: always use that function to secure a filename before storing it directly on the filesystem.
With offsite storage (S3 or Google Cloud), I will not be using Flask to store the file on the web server. Instead, I'll rename the uploaded file (using my own UUID) and then upload it elsewhere.
Example:
blob = bucket.blob('prompts/{filename}'.format(filename=uuid.uuid4()))
blob.upload_from_string(uploaded_file.read(), content_type=uploaded_file.content_type)
Under this scenario, am I right that I do not need to invoke secure_filename() first?
It would seem that because I (a) read the contents of the file into a string and then (b) use my own filename, my use case is not vulnerable to directory traversal or rogue command-type attacks (e.g. "../../../../home/username/.bashrc"), but I'm not 100% sure.
You are correct.
You only need to use the secure_filename function if you are using the value of request.files['file'].filename to build a filepath destined for your filesystem, for example as an argument to os.path.join.
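To illustrate why the client-supplied name only matters once it reaches a path, here is a minimal sketch using just the standard library. The helper escapes_base and the /srv/uploads directory are hypothetical names chosen for illustration, and posixpath is used so the behaviour is the same regardless of the platform running the snippet.

```python
import posixpath
import uuid

UPLOAD_FOLDER = "/srv/uploads"  # hypothetical upload directory


def escapes_base(base, name):
    """Return True if joining `name` onto `base` resolves outside `base`."""
    resolved = posixpath.normpath(posixpath.join(base, name))
    return posixpath.commonpath([base, resolved]) != base


# A forged filename can walk out of the upload directory.
print(escapes_base(UPLOAD_FOLDER, "../../etc/passwd"))
# An absolute filename makes join() discard the base entirely.
print(escapes_base(UPLOAD_FOLDER, "/etc/passwd"))
# A server-generated UUID has no path separators or dot segments.
print(escapes_base(UPLOAD_FOLDER, str(uuid.uuid4())))
```

The first two calls report an escape and the third does not, which is the distinction the answer relies on: the risk comes from feeding the untrusted name into the path, not from the upload itself.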
As you're using a UUID for the filename, the user input value is disregarded anyway.
Even without S3, it would also be safe NOT to use secure_filename if you used a UUID as the filename segment of the filepath on your local filesystem. For example:
uploaded_file = request.files['file']
if uploaded_file:
    file_uuid = str(uuid.uuid4())  # os.path.join needs a str, not a UUID object
    uploaded_file.save(os.path.join(app.config['UPLOAD_FOLDER'], file_uuid))
    # Rest of code
In either scenario you'd then store the UUID somewhere in the database. Whether you store the originally provided request.files['file'].filename value alongside that is your choice.
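As a rough sketch of that bookkeeping, the snippet below keeps the UUID and the client's original name side by side. It assumes an in-memory SQLite database and a hypothetical uploads table; record_upload and original_name_for are illustrative helpers, not part of Flask or Werkzeug. The stored name is whatever the caller passes in, sanitised or not.

```python
import sqlite3
import uuid

# In-memory database for illustration; the `uploads` table is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE uploads (file_uuid TEXT PRIMARY KEY, original_name TEXT)"
)


def record_upload(conn, original_name):
    """Generate the storage name server-side and remember the client's name."""
    file_uuid = str(uuid.uuid4())
    # Parameterised query: the client-supplied name is stored as data only.
    conn.execute(
        "INSERT INTO uploads (file_uuid, original_name) VALUES (?, ?)",
        (file_uuid, original_name),
    )
    conn.commit()
    return file_uuid


def original_name_for(conn, file_uuid):
    """Look up the name the client supplied, or None if the UUID is unknown."""
    row = conn.execute(
        "SELECT original_name FROM uploads WHERE file_uuid = ?", (file_uuid,)
    ).fetchone()
    return row[0] if row else None


stored_as = record_upload(conn, "My cool movie.mov")
print(stored_as, original_name_for(conn, stored_as))
```

The UUID is the only value that ever becomes a storage key or path segment; the original name is kept purely as a label.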
This might make sense if you want the user to see the original name of the file when they uploaded it. In that case it's definitely wise to run the value through secure_filename anyway, so there's never a situation where the frontend displays a listing to a user which includes a file called ../../../../ohdear.txt.
The secure_filename docstring also points out some other functionality:
Pass it a filename and it will return a secure version of it. This filename can then safely be stored on a regular file system and passed to :func:`os.path.join`. The filename returned is an ASCII only string for maximum portability. On windows systems the function also makes sure that the file is not named after one of the special device files.
>>> secure_filename("My cool movie.mov")
'My_cool_movie.mov'
>>> secure_filename("../../../etc/passwd")
'etc_passwd'
>>> secure_filename(u'i contain cool \xfcml\xe4uts.txt')
'i_contain_cool_umlauts.txt'