Is the hyphen encoded in LDAP DNs?

I am using the AntiXSS NuGet package v4.3.0 to encode strings used in LDAP connections and queries. I've run into something I don't understand: if I call

Microsoft.Security.Application.Encoder.LdapDistinguishedNameEncode("test-name")

I get the output

test#2Dname

while everywhere I search (e.g. here, here), and even the RFC (as far as I can understand it), says that the hyphen is NOT a character that needs escaping.

Is there something I'm not getting, or is this a bug in the library?

One of the RDNs in my LDAP tree has a hyphen in it ("CN=John Doe,DC=test-name,DC=net"), so this is a situation I have to handle.

That library doesn't seem to be maintained much nowadays, so this could be a real PITA.

Solution

Having a little look through the IL for this package, I can see that it does indeed encode a hyphen character (char 45).

In fact, the following characters between 32 and 126 (inclusive) will all be escaped by LdapDistinguishedNameEncode:

33 = !
34 = "
38 = &
39 = '
43 = +
44 = ,
45 = -
59 = ;
60 = <
61 = =
62 = >
92 = \
124 = |

Why?

In the library, a set of characters is declared 'safe', meaning they do not require escaping. For some reason, the characters above have been explicitly excluded from that list in LdapEncoder:

private static IEnumerable DistinguishedNameSafeList()
{
    for (int i = 32; i <= 126; i++)
        if (i != 44 && i != 43 && i != 34 && i != 92 && i != 60 && i != 62 && i != 38 && i != 33 && i != 124 && i != 61 && i != 45 && i != 39 && i != 59)
            yield return (object)i;
}
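To double-check the list above against that condition, here is a minimal sketch that recomputes the excluded characters. The class and method names (SafeListSketch, SafeList, EscapedChars) are made up for illustration and are not part of the AntiXSS library; only the numeric codes come from the decompiled method.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not library code: mirrors the exclusion condition above.
public static class SafeListSketch
{
    // Character codes that the decompiled condition excludes from the safe list.
    private static readonly HashSet<int> Excluded = new HashSet<int>
    {
        33, 34, 38, 39, 43, 44, 45, 59, 60, 61, 62, 92, 124
    };

    // Codes 32..126 that are treated as safe (left unescaped).
    public static IEnumerable<int> SafeList() =>
        Enumerable.Range(32, 95).Where(i => !Excluded.Contains(i));

    // The printable characters that will be escaped, in code order.
    public static string EscapedChars() =>
        new string(Enumerable.Range(32, 95)
            .Where(i => Excluded.Contains(i))
            .Select(i => (char)i)
            .ToArray());
}
```

Calling SafeListSketch.EscapedChars() gives the thirteen characters listed earlier, with the hyphen (45) among them.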

What to do?

Assuming you don't want to reimplement the library's encoding yourself, you could correct this with a slightly nasty bit of string replacement:

Microsoft.Security.Application.Encoder.LdapDistinguishedNameEncode("test-name").Replace("#2D", "-");

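It feels a bit hacky, but if you want to retain the hyphen I don't see what other choice you have, sadly. Be aware that, as far as I can tell from the safe list above, '#' (35) is not escaped, so a literal "#2D" already present in the input would also be turned into a hyphen by this replacement.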
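If a literal "#2D" in the input is a concern, one possible variation is to split the value on hyphens, encode each piece, and join the pieces back together, so the encoder never sees a hyphen at all. This is only a sketch: HyphenPreservingEncode is a hypothetical helper, and the encoder is passed in as a delegate so the snippet does not depend on the package. It also assumes the encoder is happy to be called on fragments; a fragment that starts with a space or '#', or ends with a space, may get that character escaped where the whole string would not.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of AntiXSS.
public static class HyphenPreservingEncode
{
    // Encodes each hyphen-separated piece with the supplied encoder and
    // rejoins the pieces with literal hyphens.
    public static string Encode(string value, Func<string, string> encode)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        if (encode == null) throw new ArgumentNullException(nameof(encode));

        return string.Join("-",
            value.Split('-')
                 // Skip the encoder for empty pieces (e.g. "a--b" or a leading hyphen).
                 .Select(piece => piece.Length == 0 ? piece : encode(piece)));
    }
}
```

Usage would look something like HyphenPreservingEncode.Encode("test-name", Microsoft.Security.Application.Encoder.LdapDistinguishedNameEncode).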

RFC 4514

The RFC (section 2.4) explicitly lists the characters that must be escaped:

a space (' ' U+0020) or number sign ('#' U+0023) occurring at the beginning of the string;

a space (' ' U+0020) character occurring at the end of the string;

one of the characters '"', '+', ',', ';', '<', '>', or '\' (U+0022, U+002B, U+002C, U+003B, U+003C, U+003E, or U+005C, respectively);

the null (U+0000) character.

However, it goes on to say:

Other characters may be escaped.

That somewhat vague statement suggests that escaping any other character, the hyphen included, is permitted, so you could potentially see any character escaped.
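For comparison, the mandatory rules quoted above can be sketched as a small escaper that touches only those characters and leaves everything else, including the hyphen, alone. This is an illustration of the RFC's list rather than a drop-in replacement for the library: DnEscapeSketch and EscapeRdnValue are hypothetical names, it uses the RFC's backslash notation (not the "#2D" form the library produces), and it handles a single attribute value, not a full DN.

```csharp
using System;
using System.Text;

// Hypothetical helper illustrating the must-escape rules of RFC 4514.
public static class DnEscapeSketch
{
    // Characters that must be escaped wherever they occur in the value.
    private const string AlwaysEscaped = "\"+,;<>\\";

    public static string EscapeRdnValue(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));

        var sb = new StringBuilder(value.Length + 8);
        for (int i = 0; i < value.Length; i++)
        {
            char c = value[i];
            bool first = i == 0;
            bool last = i == value.Length - 1;

            if (c == '\0')
            {
                // The null character is written as a backslash plus its hex code.
                sb.Append("\\00");
            }
            else if (AlwaysEscaped.IndexOf(c) >= 0
                     || (first && (c == ' ' || c == '#'))
                     || (last && c == ' '))
            {
                // Special characters are prefixed with a backslash.
                sb.Append('\\').Append(c);
            }
            else
            {
                // Everything else, including '-', is left as-is.
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this sketch, "test-name" comes back unchanged, while a value such as "Doe, John" has only its comma escaped.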