When a user registers on a site, should we use EncodeForHTML() or EncodeForURL() before storing the value in a DB?
The reason I ask this is that when I send an e-mail to someone that includes a URL that contains an email address as a URL variable, I have to use EncodeForURL(). But if this email address is already encoded using EncodeForHTML(), it will mean I have to Canonicalize() it before using EncodeForURL() on it again.
I would therefore think that EncodeForURL() is probably good, but is it 'safe' and 'correct' when storing the value in a database?
Update: Upon reading the docs, I see that EncodeForURL() is only for using a value in a URL. Therefore it seems to make sense to store the value encoded with EncodeForHTML(), then Canonicalize() it and re-encode it with EncodeForURL() when using it in a URL context. I don't know how much of a performance hit all this encoding is going to put on my server...?
Copying this from my company's internal documentation. Not sure if the images uploaded correctly since imgur is blocked at work. If not, I'll re-upload them later. I'll be publishing this and more related content to a GitHub repo in the future.
You should store it as simple text, but make sure you scrub your data on the way in using an AntiSamy library. Once the data is safe, make sure to encode the data on the way out using the proper encoder. And FYI, there's a big difference between the output of encodeForHTML() and encodeForHTMLAttribute().
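As a rough sketch of that "store raw, encode on output" flow (the `users` table, the `myDsn` datasource, the `storedEmail` variable, and the `/verify.cfm` page are hypothetical names used only for illustration, and the AntiSamy scrubbing step is left out here):

```cfml
<!--- Store the value as plain text. cfqueryparam binds it as a query
      parameter, so no HTML or URL encoding is applied before it reaches the DB.
      "myDsn" and "users" are assumed names, not part of the original answer. --->
<cfquery datasource="myDsn">
    INSERT INTO users (email)
    VALUES (<cfqueryparam value="#form.email#" cfsqltype="cf_sql_varchar">)
</cfquery>

<!--- Later, when the value is read back, encode it once per output context.
      storedEmail stands in for a value returned by a query. --->
<cfset storedEmail = "someone@example.com" />
<cfoutput>
    <!--- HTML body context --->
    <p>#encodeForHTML(storedEmail)#</p>
    <!--- URL parameter context --->
    <a href="/verify.cfm?email=#encodeForURL(storedEmail)#">Verify</a>
</cfoutput>
```

Because the stored value is never encoded, there is nothing to Canonicalize() before using it in a different context.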
In the examples below, substitute the variables that define email addresses with data from the DB.
PROTIP: Don't use these encoders in CFFORM tags. Those tags take care of the encoding for you. CF 9 and below use HTMLEditFormat(), CF 10 and above most likely use encodeForHTMLAttribute().
A basic implementation is to include a single e-mail address in order to populate the "To" field of a new e-mail window.
<cfset email = "someone@example.com" />
<a href="mailto:#email#">E-mail</a>
<a href="mailto:someone@example.com">E-mail</a>
<cfset email = "someone@example.com" />
<a href="mailto:#encodeForURL(email)#">E-mail</a>
Notice that the "@" symbol is properly percent encoded as "%40".
<a href="mailto:someone%40example.com">E-mail</a>
And if you plan on showing the e-mail address on the page as part of the link:
<cfset email = "someone@example.com" />
<a href="mailto:#encodeForURL(email)#">#encodeForHTML(email)#</a>
An advanced implementation includes e-mail addresses for "To" & "CC". It can also pre-populate the body and subject of the new e-mail.
<cfset email = "someone@example.com" />
<cfset email_cc = "someone_else@example.com" />
<cfset subject = "This is the subject" />
<cfset body = "This is the body" />
<a href="mailto:#email#?cc=#email_cc#&subject=#subject#&body=#body#">E-mail</a>
<a href="mailto:someone@example.com?cc=someone_else@example.com&subject=This is the subject&body=This is the body">E-mail</a>
Notice that the subject and body parameters contain unencoded spaces. While this string will technically work, it is still open to attack.
Imagine the value of body is set by the result of a database query. This record has been "infected" by a malicious user and the default body message has an appended "BCC" address, so some evil user can get copies of e-mails sent via this link.
<cfset body = "This is the body&bcc=someone@evil.com" />
<a href="mailto:someone@example.com?cc=someone_else@example.com&subject=This is the subject&body=This is the body&bcc=someone@evil.com">E-mail</a>
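Since "href" is an attribute of the <a> tag, you might think to use the HTML Attribute encoder. This would be incorrect: the browser decodes HTML entities in the attribute value before it interprets the URL, so the injected "&bcc=" still reaches the mail client as a separate parameter.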
<cfset email = "someone@example.com" />
<cfset email_cc = "someone_else@example.com" />
<cfset subject = "This is the subject" />
<cfset body = "This is the body&bcc=someone@evil.com" />
<a href="mailto:#encodeForHTMLAttribute(email)#?cc=#encodeForHTMLAttribute(email_cc)#&subject=#encodeForHTMLAttribute(subject)#&body=#encodeForHTMLAttribute(body)#">E-mail</a>
<a href="mailto:someone@example.com?cc=someone_else@example.com&subject=This is the subject&body=This is the body&bcc=someone@evil.com">E-mail</a>
The correct encoding of a MAILTO link is done with the URL encoder.
<cfset email = "someone@example.com" />
<cfset email_cc = "someone_else@example.com" />
<cfset subject = "This is the subject" />
<cfset body = "This is the body&bcc=someone@evil.com" />
<a href="mailto:#encodeForURL(email)#?cc=#encodeForURL(email_cc)#&subject=#encodeForURL(subject)#&body=#encodeForURL(body)#">E-mail</a>
Notice these things about the URL encoder:
<a href="mailto:someone%40example.com?cc=someone_else%40example.com&subject=This+is+the+subject&body=This+is+the+body%26bcc%3Dsomeone%40evil.com">E-mail</a>