How to defend against XSS when saving data and when displaying it

Let's say I have a simple CRUD application with a form to add a new object and edit an existing one. From a security point of view, I want to defend against cross-site scripting. First, I would validate the submitted data on the server. After that, I would also escape the values displayed in the view, because more than one application may write to my database (for example, some developer might insert unvalidated data into the DB by mistake in the future). So I will have this JSP:

<%@ taglib prefix="esapi" uri="http://www.owasp.org/index.php/Category:OWASP_Enterprise_Security_API" %>
<form ...>
   <input name="myField" value="<esapi:encodeForHTMLAttribute>${myField}</esapi:encodeForHTMLAttribute>" />
</form>

<esapi:encodeForHTMLAttribute> does almost the same thing as <c:out>: it HTML-escapes sensitive characters such as <, > and ".

Now, if I load an object that was somehow saved in the database with myField=abc<def, the input will correctly display the value abc<def, while the value in the HTML source behind it will be abc&lt;def.

The problem is that when the user submits this form without changing the values, the server receives the value abc&lt;def instead of what is visible on the page, abc<def. So this is not correct. How should I implement the protection in this case?

Solution

The problem is when the user submits this form without changing the values, the server receives the value abc&lt;def instead of what is visible in the page abc<def

Easy. In this case, HTML-decode the value and then validate it.
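As a rough illustration of "decode, then validate", here is a minimal sketch using only the Java standard library. The class name, the tiny entity table, and the whitelist pattern are all assumptions made for this example; in a real application you would rely on a maintained library such as ESAPI rather than a hand-rolled decoder.

```java
import java.util.regex.Pattern;

final class DecodeThenValidate {
    // Hypothetical whitelist for myField: letters, digits, spaces and a few
    // punctuation marks, 1 to 50 characters. Adjust to your own rules.
    private static final Pattern ALLOWED = Pattern.compile("^[A-Za-z0-9 .,_-]{1,50}$");

    // Decodes only five common entities (an assumption for this sketch).
    // "&amp;" is handled last so that "&amp;lt;" becomes "&lt;", not "<".
    static String htmlDecode(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#x27;", "'")
                .replace("&amp;", "&");
    }

    // Validation runs against the decoded value, never the encoded one.
    static boolean isValid(String submitted) {
        return ALLOWED.matcher(htmlDecode(submitted)).matches();
    }

    public static void main(String[] args) {
        System.out.println(htmlDecode("abc&lt;def"));
        System.out.println(isValid("abc&lt;def"));
        System.out.println(isValid("abc def"));
    }
}
```

With this whitelist, a submitted abc&lt;def is checked as abc<def and rejected because of the < character, while abc def passes.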

That said, as noted in a few comments, you should look at how we handle this in the OWASP ESAPI-Java project. By default we always canonicalize the data, which means we run a series of decoders both to detect multiple or mixed encoding and to produce a string that is safe to validate against with a regex.
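The idea behind canonicalization can be sketched as a loop that keeps decoding until the value stops changing, and treats more than one decoding pass as a sign of multiple encoding. This is not ESAPI's implementation: the class, the five-entity decoder, and the choice of exception are assumptions for illustration, and it covers HTML entities only, not mixed encodings.

```java
final class CanonicalizeSketch {
    // One decoding pass over five common HTML entities (illustrative only).
    static String decodeOnce(String s) {
        return s.replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&#x27;", "'")
                .replace("&amp;", "&");
    }

    // Decodes repeatedly until the string is stable. Input that needs more
    // than one pass was encoded more than once, which this sketch rejects.
    static String canonicalize(String input) {
        String current = input;
        int passes = 0;
        while (true) {
            String next = decodeOnce(current);
            if (next.equals(current)) {
                break;
            }
            current = next;
            passes++;
        }
        if (passes > 1) {
            throw new IllegalArgumentException("Multiple encoding detected");
        }
        return current;
    }

    public static void main(String[] args) {
        System.out.println(canonicalize("abc&lt;def"));
        try {
            canonicalize("abc&amp;lt;def");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The loop terminates because every pass that changes the string also shortens it.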

For the part that really guarantees you protection, however, you normally want to store raw text on the server, not anything that contains HTML-encoded characters. So you may wish to store the unescaped string, if only so that you can safely encode it when you send it back to the user.

Encoding is the best protection against XSS, and I would in fact recommend it ahead of input validation if for some reason you had to choose between them.
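To make the "encode on output" step concrete, here is a minimal sketch of an attribute encoder applied to the raw stored value just before it is written into the page. The class and method names are hypothetical, and the five-character table is a simplification: a real encoder such as the one behind <esapi:encodeForHTMLAttribute> covers many more characters, so treat this as an illustration rather than a replacement.

```java
final class AttributeEncodeSketch {
    // Escapes the characters that can break out of a double-quoted or
    // single-quoted HTML attribute value (a deliberately small set).
    static String encodeForHtmlAttribute(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The database holds the raw value; encoding happens only at render time.
        String stored = "abc<def";
        System.out.println("<input name=\"myField\" value=\""
                + encodeForHtmlAttribute(stored) + "\" />");
    }
}
```

Because the stored value stays raw, the same data can later be encoded differently for other output contexts without first undoing an HTML escape.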

I say "may" because in general I think it's bad practice to store altered data. It can make troubleshooting a chore. This gets even more complicated if you're using a technology like TinyMCE, a rich-text editor that runs in the browser. It also renders HTML, so it's like dealing with a browser within a browser.