
Parse LDAP filter to escape special characters


An EJB service takes an LDAP filter as a string and returns a result from Active Directory.

The problem is that attribute values sometimes contain special characters that need to be escaped. The rules for the filter as a whole are specified here:
https://msdn.microsoft.com/en-us/library/aa746475(v=vs.85).aspx
and the rules for distinguished name (DN) attribute values are specified here:
https://msdn.microsoft.com/en-us/library/aa366101(v=vs.85).aspx
To accomplish this, the service must do the following:

  1. Analyze the string for DN values, separate them, and escape them as per the DN escape rules if they are not already escaped.
  2. Search the remainder of the string for special characters in attribute values and escape them as per the general filter escape rules if they are not already escaped.
  3. Combine the results as the new escaped filter and pass it on.

Java's built-in javax.naming.ldap.Rdn escapes DN values correctly but is not idempotent: escaping an already escaped value escapes it again. As for the other tasks, so far I have been unable to find a library that would allow me to accomplish them.
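
To make the idempotency problem concrete, here is a minimal sketch using the JDK's `Rdn.escapeValue` and `Rdn.unescapeValue`. The class name, the helper methods, and the sample value are made up for illustration; only the two `Rdn` calls come from the JDK.

```java
import javax.naming.ldap.Rdn;

public class RdnEscapeDemo {
    // Escapes a raw attribute value once using the JDK's Rdn helper.
    static String escapeOnce(String raw) {
        return Rdn.escapeValue(raw);
    }

    // Escapes an already escaped value again, as a service would if it
    // could not tell whether the caller had escaped the input.
    static String escapeTwice(String raw) {
        return Rdn.escapeValue(Rdn.escapeValue(raw));
    }

    public static void main(String[] args) {
        String raw = "Smith, John"; // made-up sample value
        System.out.println(escapeOnce(raw));
        System.out.println(escapeTwice(raw));
        // Unescaping the singly escaped form recovers the raw value.
        System.out.println(Rdn.unescapeValue(escapeOnce(raw)));
    }
}
```

The second escape treats the backslash added by the first as data and escapes it too, so the two results differ. This is why the service cannot safely escape input that may or may not be escaped already.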

Right now I am inclined to think that escaping the LDAP filter should be done by the user of the service rather than by the service itself, as it is very hard for the service to tell escapes from actual values. Also, parsing a complex string such as an LDAP filter without a well-tested library seems error-prone to me.

Any ideas on how to solve this? Can this task be automated at all?


Solution

  • For escaping LDAP filters, I relied on this page to write the code below: http://social.technet.microsoft.com/wiki/contents/articles/5392.active-directory-ldap-syntax-filters.aspx#Special_Characters

    String LdapEscape(String ldap)
    {
        if (ldap == null) return "";
        // Backslash must be replaced first so that later escapes are not escaped again.
        return ldap.replace("\\", "\\5C")
                   .replace("*", "\\2A")
                   .replace("(", "\\28")
                   .replace(")", "\\29")
                   .replace("\000", "\\00");
    }
    

    The most important thing to keep in mind here is that replacing \ with \5C must happen first so that you don't double escape any characters. Otherwise it's very straightforward; there aren't any special tricks to watch out for.

    I'd like to point out that this is meant to escape individual values placed in LDAP filters, not the entire LDAP filter. However, if you wanted, you could use that function to escape something like this so that it can be searched for as literal text:

    LdapEscape("(!(sn=m*))"); // \28!\28sn=m\2A\29\29
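
As a rough sketch of the per-value usage described above, the caller can escape each value before concatenating it into the filter, so the filter's own parentheses and operators stay untouched. The class name, `escapeFilterValue`, `equalsFilter`, and the sample attribute values are hypothetical and exist only for this example; the replacements are the same ones used in `LdapEscape`.

```java
public class FilterBuilderDemo {
    // Same replacements as LdapEscape above; the backslash is handled first.
    static String escapeFilterValue(String value) {
        if (value == null) return "";
        return value.replace("\\", "\\5C")
                    .replace("*", "\\2A")
                    .replace("(", "\\28")
                    .replace(")", "\\29")
                    .replace("\0", "\\00");
    }

    // Hypothetical helper: builds (attribute=value) with only the value escaped.
    static String equalsFilter(String attribute, String value) {
        return "(" + attribute + "=" + escapeFilterValue(value) + ")";
    }

    public static void main(String[] args) {
        // Made-up sample values; the second one contains filter metacharacters.
        String filter = "(&" + equalsFilter("objectClass", "user")
                             + equalsFilter("cn", "Smith (Sales)*") + ")";
        System.out.println(filter);
        // prints (&(objectClass=user)(cn=Smith \28Sales\29\2A))
    }
}
```

Escaping at the point where the value is inserted avoids the need to parse a finished filter and guess which characters are syntax and which are data.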