Flask-WTF: CSRF token missing


What seemed like a simple bug - a form submission that won't go through due to a "CSRF token missing" error - has turned into a day of hair pulling. I have gone through every SO article related to Flask or Flask-WTF and missing CSRF tokens, and nothing seems to be helping.

Here are the details:

Following Martijn's guidelines from an earlier question:

The Flask-WTF CSRF infrastructure rejects a token if:

1) the token is missing. Not the case here, you can see the token in the form.

The token is definitely present in my form, and being POST'ed successfully

2) it is too old (default expiration is set to 3600 seconds, or an hour). Set the TIME_LIMIT attribute on forms to override this. Probably not the case here.

Also OK for me - the token is well within the default expiration time

3) if no 'csrf_token' key is found in the current session. You can apparently see the session token, so that's out too.

In my case, session['csrf_token'] is properly set and seen by Flask

4) If the HMAC signature doesn't match; the signature is based on the random value set in the session under the 'csrf_token' key, the server-side secret, and the expiry timestamp in the token.

This is my problem. The HMAC comparison between the submitted form's CSRF and the session CSRF fails. And yet I don't know how to solve it. I've been desperate enough (as with the other questioner) to dig into Flask-WTF code and set debugging messages to find out what's going on. As best I can tell, it's working like this:

1) generate_csrf_token() in "form.py" (Flask-WTF) wants to generate a CSRF token. So it calls:

2) generate_csrf() in "csrf.py". That function generates a new session['csrf_token'] if one does not exist. In my case, this always happens - although other session variables appear to persist between requests, my debugging shows that I never have a 'csrf_token' in my session at the start of a request. Is this normal?

3) The generated token is returned and presumably incorporated into the form variable when I render hidden fields on the template. (again, debugging shows that this token is present in the form and properly submitted and received)

4) Next, the form is submitted.

5) Now, validate_csrf in csrf.py is called. But since another request has taken place, and generate_csrf() has generated a new session CSRF token, the two timestamps for the two tokens (in session and from the form) will not match. And since the CSRF token is made up in part of its expiration timestamp, validation fails.

I suspect the problem is in step #2, where a new token is being generated for every request. But I have no clue why other variables in my session are persisting from request to request, but not "csrf_token".
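
To make the suspicion concrete, here is a minimal standard-library sketch of the scheme described in the steps above. It is not Flask-WTF's actual code: the function names, the `SECRET_KEY` value, and the `expires##digest` token layout are assumptions made for illustration. It shows only that a token signed against one session value cannot validate once the session value has been regenerated.

```python
import hashlib
import hmac
import os
import time

SECRET_KEY = b"not-the-real-secret"  # hypothetical server-side secret


def get_or_create_session_value(session):
    """Mimic step 2: reuse session['csrf_token'] or create a fresh one."""
    if "csrf_token" not in session:
        session["csrf_token"] = hashlib.sha1(os.urandom(64)).hexdigest()
    return session["csrf_token"]


def sign_token(session_value, expires):
    """Build '<expires>##<hmac>' from the session value and the expiry."""
    message = "{}{}".format(session_value, expires).encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha1).hexdigest()
    return "{}##{}".format(expires, digest)


def validate_token(token, session, now=None):
    """Return True only if the token is unexpired and matches the session."""
    now = time.time() if now is None else now
    if "csrf_token" not in session or "##" not in token:
        return False
    expires = token.split("##", 1)[0]
    if float(expires) < now:
        return False
    expected = sign_token(session["csrf_token"], expires)
    return hmac.compare_digest(expected, token)


# Case A: the session keeps its value between rendering and submitting.
session = {}
expires = int(time.time()) + 3600
form_token = sign_token(get_or_create_session_value(session), expires)
kept = validate_token(form_token, session)

# Case B: the session value was lost, so a new one is generated first.
lost_session = {}
get_or_create_session_value(lost_session)
lost = validate_token(form_token, lost_session)

print(kept, lost)
```

In this sketch the second check fails even though the secret and the expiry are unchanged, which matches the symptom: the form token itself is fine, but the session value it was signed against is gone.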

There is no weirdness going on with SECRET_KEY or WTF_CSRF_SECRET_KEY either (they are properly set).

Anyone have any ideas?


Solution

  • I figured it out. It appears to be a cookie/session size limit (which is probably beyond Flask's control) and a silent discarding of session variables when the limit is hit (which seems more like a bug).

    Here's an example:

    templates/hello.html

    <p>{{ message|safe }}</p>
    <form name="loginform" method="POST">
      {{ form.hidden_tag() }}
      {{ form.submit_button() }}
    </form>
    

    myapp.py

    from flask import Flask, make_response, render_template, session
    from flask_restful import Resource, Api
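    from flask_wtf import csrf, FlaskForm  # FlaskForm replaces the deprecated Form
    from wtforms import SubmitField
    
    app = Flask(__name__)
    app.secret_key = '5accdb11b2c10a78d7c92c5fa102ea77fcd50c2058b00f6e'
    api = Api(app)
    
    # Size of the list stored in the session; raise it to overflow the cookie
    num_elements_to_generate = 500
    
    class HelloForm(FlaskForm):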
        submit_button = SubmitField('Submit This Form')
    
    class Hello(Resource):
        def check_session(self):
            if session.get('big'):
                message = "session['big'] contains {} elements<br>".format(len(session['big']))
            else:
                message = "There is no session['big'] set<br>"
            message += "session['secret'] is {}<br>".format(session.get('secret'))
            message += "session['csrf_token'] is {}<br>".format(session.get('csrf_token'))
            return message
    
        def get(self):
            myform = HelloForm()
            session['big'] = list(range(num_elements_to_generate))
            session['secret'] = "A secret phrase!"
            csrf.generate_csrf()
            message = self.check_session()
            return make_response(render_template("hello.html", message=message, form=myform), 200, {'Content-Type': 'text/html'})
    
        def post(self):
            csrf.generate_csrf()
            message = self.check_session()
            return make_response("<p>This is the POST result page</p>" + message, 200, {'Content-Type': 'text/html'})
    
    api.add_resource(Hello, '/')
    
    if __name__ == '__main__':
        app.run(debug=True)
    

    Run this with num_elements_to_generate set to 500 and you'll get something like this:

    session['big'] contains 500 elements
    session['secret'] is 'A secret phrase!'
    session['csrf_token'] is a6acb57eb6e62876a9b1e808aa1302d40b44b945
    

    and a "Submit This Form" button. Click the button, and you'll get:

    This is the POST result page
    session['big'] contains 500 elements
    session['secret'] is 'A secret phrase!'
    session['csrf_token'] is a6acb57eb6e62876a9b1e808aa1302d40b44b945
    

    All well and good. But now change num_elements_to_generate to 3000, clear your cookies, rerun the app and access the page. You'll get something like:

    session['big'] contains 3000 elements
    session['secret'] is 'A secret phrase!'
    session['csrf_token'] is 709b239857fd68a4649deb864868897f0dc0a8fd
    

    and a "Submit This Form" button. Click the button, and this time you'll get:

    This is the POST result page
    There is no session['big'] set
    session['secret'] is 'None'
    session['csrf_token'] is 13553dce0fbe938cc958a3653b85f98722525465
    

    3,000 integers stored in the session variable is too much for the session cookie, so the session variables do not persist between requests. Interestingly they DO exist in the session on the first page (no matter how many elements you generate), but they will not survive to the next request. And Flask-WTF, since it does not see a csrf_token in the session when the form is posted, generates a new one. If this were a form validation step, the CSRF validation would fail.
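
A rough way to see how close a session is to the limit is to estimate its encoded size. The sketch below uses only the standard library and merely approximates what a cookie-based session serializer does (JSON, optional zlib compression, URL-safe base64). The helper names and the 4093-byte budget are assumptions. It ignores the signature, timestamp, cookie name and cookie attributes, so the real cookie will be somewhat larger than the estimate.

```python
import base64
import json
import zlib

COOKIE_LIMIT = 4093  # assumed per-cookie byte budget; browsers vary


def estimate_cookie_payload_size(session_data):
    """Roughly estimate the encoded size of a client-side session."""
    raw = json.dumps(session_data, separators=(",", ":")).encode("utf-8")
    compressed = zlib.compress(raw)
    # Keep whichever form is smaller, as a compressing serializer would.
    body = compressed if len(compressed) < len(raw) else raw
    return len(base64.urlsafe_b64encode(body))


def fits_in_cookie(session_data, limit=COOKIE_LIMIT):
    """Return True if the estimated payload stays within the budget."""
    return estimate_cookie_payload_size(session_data) <= limit


small = {"secret": "A secret phrase!"}
for count in (500, 3000):
    data = dict(small, big=list(range(count)))
    print(count, estimate_cookie_payload_size(data), fits_in_cookie(data))
```

Because of the overhead that is left out, treat a result near the budget as already too big.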

    This seems to be a known Flask (or Werkzeug) bug, with a pull request here. I'm not sure why Flask isn't generating a warning here - unless it is somehow technically infeasible, it's an unexpected and unpleasant surprise that it silently fails to keep the session variables when the cookie is too big.
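
Until the framework complains on its own, a check like the following can at least make the failure loud. It is a sketch under assumptions: `warn_on_oversized_cookies` is a hypothetical helper, the 4093-byte limit is an assumed budget, and the input is a plain list of (name, value) header pairs rather than a real response object.

```python
import warnings

COOKIE_LIMIT = 4093  # assumed per-cookie byte budget; browsers vary


def warn_on_oversized_cookies(headers, limit=COOKIE_LIMIT):
    """Warn about every Set-Cookie header longer than `limit` bytes.

    `headers` is an iterable of (name, value) pairs. Returns the names
    of the offending cookies so callers can log or assert on them.
    """
    oversized = []
    for name, value in headers:
        if name.lower() != "set-cookie":
            continue
        size = len(value.encode("utf-8"))
        if size > limit:
            cookie_name = value.split("=", 1)[0]
            warnings.warn(
                "cookie {!r} is {} bytes (limit {}); browsers may drop it".format(
                    cookie_name, size, limit
                )
            )
            oversized.append(cookie_name)
    return oversized


example_headers = [
    ("Content-Type", "text/html"),
    ("Set-Cookie", "session=" + "x" * 5000 + "; HttpOnly; Path=/"),
]
print(warn_on_oversized_cookies(example_headers))
```

In a Flask app, one place to call something like this would be an after_request hook, passing it the response's headers.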