I am using the AntiXss NuGet package v4.3.0 to encode strings used in LDAP connections and queries, and I have run into something I don't understand: if I call
Microsoft.Security.Application.Encoder.LdapDistinguishedNameEncode("test-name")
I get the output
test#2Dname
while everything I can find (e.g. here and here), and even the RFC standard (as far as I can understand it), says that the hyphen is NOT a character that needs escaping.
Is there something I'm not getting or is this a bug of the library?
One of the RDNs in my LDAP tree has a hyphen in it ("CN=John Doe,DC=test-name,DC=net"), so this is a situation I have to handle.
That library doesn't seem to be maintained much nowadays, so this could be a real PITA.
Having a little look through the IL for this package, I can see that it does indeed encode a hyphen character (char 45).
In fact, the following characters between 32 and 126 (inclusive) will all be escaped by LdapDistinguishedNameEncode:
33 = !
34 = "
38 = &
39 = '
43 = +
44 = ,
45 = -
59 = ;
60 = <
61 = =
62 = >
92 = \
124 = |
Why?
In the library, a set of characters is declared 'safe', meaning they do not require escaping. For some reason, the characters listed above have been explicitly excluded from that set in LdapEncoder:
private static IEnumerable DistinguishedNameSafeList()
{
    for (int i = 32; i <= 126; i++)
        if (i != 44 && i != 43 && i != 34 && i != 92 && i != 60 && i != 62 && i != 38 && i != 33 && i != 124 && i != 61 && i != 45 && i != 39 && i != 59)
            yield return (object)i;
}
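To double-check the list of escaped characters against that condition, here is a minimal sketch that collects the code points the decompiled loop skips. The class and method names (SafeListDemo, EscapedCharacters) are made up for illustration and are not part of AntiXss; the excluded values are copied from the condition above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SafeListDemo
{
    // Code points that the decompiled DistinguishedNameSafeList() leaves OUT
    // of the safe list (copied from its "i != ..." condition).
    private static readonly HashSet<int> Excluded = new HashSet<int>
    {
        33, 34, 38, 39, 43, 44, 45, 59, 60, 61, 62, 92, 124
    };

    // Returns the characters in the range 32..126 that are not on the safe list,
    // in ascending code point order.
    public static string EscapedCharacters()
    {
        return new string(Enumerable.Range(32, 95)   // 95 values: 32..126 inclusive
            .Where(Excluded.Contains)
            .Select(i => (char)i)
            .ToArray());
    }
}
```

Calling SafeListDemo.EscapedCharacters() and checking the result for '-' confirms that the hyphen (45) is among the excluded characters.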
What to do?
Presuming you don't want to reimplement the library's encoding yourself, I'd suggest a nasty bit of string replacement to undo the hyphen escape:
Microsoft.Security.Application.Encoder.LdapDistinguishedNameEncode("test-name").Replace("#2D", "-");
It feels a bit hacky, but if you want to retain the hyphen I don't see what other choice you have sadly.
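If you go this route, it may be worth wrapping the replacement in one place so it is easy to find and remove later. A minimal sketch, where LdapEncodingWorkaround and RestoreHyphens are hypothetical names of my own, not part of AntiXss:

```csharp
using System;

public static class LdapEncodingWorkaround
{
    // Hypothetical helper: turns the "#2D" escape back into a literal hyphen.
    // It expects a string that has ALREADY been through LdapDistinguishedNameEncode.
    public static string RestoreHyphens(string encoded)
    {
        if (encoded == null) throw new ArgumentNullException(nameof(encoded));
        // Caveat: this also rewrites a literal "#2D" that was present in the
        // original input. Going by the safe list above, '#' (35) is not escaped
        // in the middle of a string, so such input would be indistinguishable.
        return encoded.Replace("#2D", "-");
    }
}
```

Usage would then be LdapEncodingWorkaround.RestoreHyphens(Microsoft.Security.Application.Encoder.LdapDistinguishedNameEncode("test-name")). Whether the literal "#2D" caveat matters depends on whether your names can ever contain that sequence.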
RFC 4514
The RFC explicitly lists the characters that need escaping in an attribute value:
- a space (' ' U+0020) or number sign ('#' U+0023) occurring at the beginning of the string;
- a space (' ' U+0020) character occurring at the end of the string;
- one of the characters '"', '+', ',', ';', '<', '>', or '\' (U+0022, U+002B, U+002C, U+003B, U+003C, U+003E, or U+005C, respectively);
- the null (U+0000) character.
However, it goes on to say:
Other characters may be escaped.
That somewhat vague statement suggests that an encoder is free to escape any character, so you should be prepared to see escapes beyond the required list.
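For comparison, the required escapes quoted above are small enough to sketch by hand. The following is only an illustration of those four rules, not a replacement for the library: Rfc4514Sketch and EscapeValue are hypothetical names, the method handles a single attribute value (not a whole DN), and it uses the backslash-prefix form, with \00 for the null character.

```csharp
using System;
using System.Text;

public static class Rfc4514Sketch
{
    // Characters that need escaping anywhere in the value.
    private const string Special = "\"+,;<>\\";

    // Escapes one attribute value according to the rules quoted above.
    public static string EscapeValue(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        var sb = new StringBuilder(value.Length + 8);
        for (int i = 0; i < value.Length; i++)
        {
            char c = value[i];
            bool first = i == 0;
            bool last = i == value.Length - 1;

            if (c == '\0')
            {
                sb.Append("\\00");             // null is written as a hex pair
            }
            else if (Special.IndexOf(c) >= 0
                     || (first && (c == ' ' || c == '#'))
                     || (last && c == ' '))
            {
                sb.Append('\\').Append(c);     // backslash-prefix form
            }
            else
            {
                sb.Append(c);                  // everything else, including '-', is left alone
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, a value such as "test-name" comes back unchanged, while the comma in "Doe, John" gets a backslash in front of it.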