What are some common XSS vectors for websites, aside from unsanitized input from text fields finding its way back into pages? I'm trying to prevent malicious access to CSRF tokens in cookies. I'm escaping unsafe characters from text inputs (and will probably end up adding that in the Java servlets as well, before database inserts or printing to the UI). Where else should I be looking for XSS entering the site?
If I understand the question correctly, you have mitigated some forms of reflected and stored XSS by encoding user input from input fields in the UI.
You should be aware of a few things:
- Not all user input arrives through UI input fields. Cookies and request headers are also user input, as are hidden fields and JSON, XML or any other type of parameter. If your application processes files or receives outside requests over protocols other than HTTP, those are user input too. Even database fields are best treated as user input and encoded when they are written to the page, especially if other components also write to the database.
- Maybe this is already the case in your application, but to make this answer more comprehensive: XSS is more of an output issue. Regardless of where user input comes from, the solution is output encoding most of the time (and not input validation or sanitization by itself, especially not with blacklists). There may be carefully considered exceptions to this, and input validation does nicely complement proper output encoding.
- The encoding method should be selected according to the context where the data is written (i.e. you need a different encoding when writing into a JavaScript block than when writing into plain HTML; also note that JavaScript is not only found inside script tags, but also inside event attributes such as onclick).
- DOM XSS happens entirely on the client and must be mitigated there, in JavaScript. See the relevant OWASP guides below.
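To illustrate the point about context-dependent encoding, here is a minimal Java sketch with one encoder for HTML content and one for quoted JavaScript string literals. The class and method names (`ContextEncoders`, `forHtml`, `forJavaScriptString`) are made up for this example and are not from any library; the character sets are a conservative assumption, not a complete specification.

```java
// Hypothetical helper class for illustration only; not a vetted library.
final class ContextEncoders {

    private ContextEncoders() {}

    // HTML context: element content or a quoted attribute value.
    // Replaces the characters that can open tags, entities or close quotes.
    static String forHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // JavaScript string context: the value ends up between quotes in a script.
    // Everything except ASCII letters, digits and spaces is written as a
    // four-digit hexadecimal unicode escape, so quotes, slashes and angle
    // brackets cannot terminate the string or the script block.
    static String forJavaScriptString(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean safe = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9')
                    || c == ' ';
            if (safe) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The same untrusted value needs a different encoding per context.
        String untrusted = "';alert(1);//<script>";
        System.out.println("HTML context:       " + forHtml(untrusted));
        System.out.println("JavaScript context: " + forJavaScriptString(untrusted));
    }
}
```

Note that HTML encoding alone would not be enough inside a script block, and JavaScript encoding alone would not be enough in plain HTML. A value inside an event attribute such as onclick is JavaScript nested in HTML, so it would need JavaScript encoding first and HTML attribute encoding on top of that. In a real application, a maintained library such as the OWASP Java Encoder (`Encode.forHtml`, `Encode.forJavaScript`) is probably a safer choice than hand-rolled code like this.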
The general OWASP XSS page is very useful.
They also have a few guides: