
Active directory query with wildcards has poor performance


I am writing a method in C# that queries Active Directory and finds all users and groups whose display name matches the pattern *{displayName}* (a wildcard search with both a leading and a trailing wildcard). The method will be used for an autocomplete field.

The problem is that the method I wrote performs very poorly: a query against AD takes anywhere between 30 seconds and a full minute, depending on the query string.

My organization's AD is very large, but if it takes this long the autocomplete field will be pointless.

Here is the code I am using right now:

// Initialize the results list.
result.queryResults = new List<Classses.ADSearchObject>();

// Set up domain context.
PrincipalContext pc = new PrincipalContext(ContextType.Domain, Domain, Constants.adQueryUser, Constants.adQueryPassword);

// Set up a directory searcher.
DirectorySearcher dSearcher = new DirectorySearcher();
// Define a SearchResultCollection to store the results.
SearchResultCollection searchCol;
// Define returned result paging for performance.
dSearcher.PageSize = 1000;
// Define the properties to retrieve
dSearcher.PropertiesToLoad.Add("sAMAccountName");
dSearcher.PropertiesToLoad.Add("displayName");
// Define the filter for users.
dSearcher.Filter = $"(|(&(displayName={result.querystring}*)(objectCategory=person))(&(displayName=*{result.querystring})(objectCategory=person)))";

// Search based on the filter and save the results.
searchCol = dSearcher.FindAll();

// Add the results to the returned object 
foreach (SearchResult searchResult in searchCol)
{
   DirectoryEntry de = searchResult.GetDirectoryEntry();
   // Code to get data from the results...
}

// Define the filter for groups.
dSearcher.Filter = $"(|(&(displayName={result.querystring}*)(objectCategory=group))(&(displayName=*{result.querystring})(objectCategory=group)))";

// Search based on the filter and save the results.
searchCol = dSearcher.FindAll();

// Add the results to the returned object 
foreach (SearchResult searchResult in searchCol)
{
   DirectoryEntry de = searchResult.GetDirectoryEntry();
   // Code to get data from the results...
}

Currently the search is split into users and groups to make it easy to distinguish between them, but if it improves performance substantially I will merge them into a single search.

Edit: As the user rene suggested, I used a Stopwatch to check the time it takes for FindAll and I also checked how long my foreach loops take.

I found out that the FindAll calls take about 100ms (very fast), even when searching with a leading wildcard (which isn't indexed by AD).

Apparently the calls that take longest are my foreach loops which take about 40 seconds (40,000ms).

I am updating the question with the code block in my foreach loops as I haven't figured out how to improve its performance:

// --- I started a stopwatch here
foreach (SearchResult searchResult in searchCol)
{
   // --- I stopped the stopwatch here and noticed it takes about 30,000ms
   result.code = 0;

   DirectoryEntry de = searchResult.GetDirectoryEntry();

   ADSearchObject adObj = new ADSearchObject();

   adObj.code = 0;

   if (de.Properties.Contains("displayName"))
   {
      adObj.displayName = de.Properties["displayName"].Value.ToString();
   }

   adObj.type = "user";

   result.queryResults.Add(adObj);
}

Note where I started and stopped my Stopwatch in the updated code. I don't know why beginning the loop takes so long.


Solution

  • Of course, a substring match is more costly than an equality match on a unique value. It also doesn't surprise me that the lion's share of the elapsed time falls in your iterator block, which consumes 40s overall according to your profiling.

    If you are convinced that merely setting up an iterator causes a huge drop in performance, I am not, and that is because of your choice of timing points. To time the iteration itself, separately from the loop body, measure it like this:

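    // Times only the iteration overhead (the enumerator), not the loop body.
    var clock = System.Diagnostics.Stopwatch.StartNew();
    foreach (SearchResult searchResult in searchCol)
    {
        clock.Stop();    // pause while the body runs
        // loop body goes here (or leave it empty)
        clock.Start();   // resume; Start() does not reset the elapsed time
    }
    clock.Stop();
    Console.WriteLine($"foreach: {clock.ElapsedMilliseconds} ms");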
    

    I expect a huge performance gain (for large numbers of entries) if you follow a best practice I already mentioned in the comments: send a single request to the server that returns everything you need in the search result, and don't send another request for each item. Even if a single call to GetDirectoryEntry() consumes less than 1ms, the large number of entries makes your code useless for your application's autocomplete feature.

    Kudos to @rene for presenting a normal form for that filter expression. I don't know about filter optimization in Active Directory, so I would take the sure path with

    (&(objectCategory=person)(displayName=*{result.querystring}*))
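
The advice above could be sketched roughly as follows: run one search and read displayName directly from each SearchResult instead of calling GetDirectoryEntry() per item. This is only a sketch under assumptions, not the asker's actual code. The names AdAutocomplete, EscapeFilterValue, BuildPersonFilter and FindDisplayNames are hypothetical, the default search root of the current domain is assumed, and escaping the user input per RFC 4515 is an addition that the original code does not contain. It needs a reference to System.DirectoryServices and a reachable domain to run.

```csharp
using System.Collections.Generic;
using System.DirectoryServices;
using System.Text;

public static class AdAutocomplete
{
    // Hypothetical helper: escapes the characters that RFC 4515
    // reserves inside an LDAP filter value.
    public static string EscapeFilterValue(string value)
    {
        var sb = new StringBuilder(value.Length);
        foreach (char c in value)
        {
            switch (c)
            {
                case '\\': sb.Append(@"\5c"); break;
                case '*':  sb.Append(@"\2a"); break;
                case '(':  sb.Append(@"\28"); break;
                case ')':  sb.Append(@"\29"); break;
                case '\0': sb.Append(@"\00"); break;
                default:   sb.Append(c); break;
            }
        }
        return sb.ToString();
    }

    // Builds the single "contains" filter suggested in the answer.
    public static string BuildPersonFilter(string query)
    {
        return $"(&(objectCategory=person)(displayName=*{EscapeFilterValue(query)}*))";
    }

    // One search request; values come from the result set itself,
    // so no extra round trip is made per entry.
    public static List<string> FindDisplayNames(string query)
    {
        var names = new List<string>();
        using (var searcher = new DirectorySearcher())
        {
            searcher.Filter = BuildPersonFilter(query);
            searcher.PageSize = 1000;
            searcher.PropertiesToLoad.Add("displayName");

            using (SearchResultCollection results = searcher.FindAll())
            {
                foreach (SearchResult sr in results)
                {
                    // Properties holds only what PropertiesToLoad asked for.
                    ResultPropertyValueCollection values = sr.Properties["displayName"];
                    if (values.Count > 0)
                    {
                        names.Add(values[0].ToString());
                    }
                }
            }
        }
        return names;
    }
}
```

A call such as AdAutocomplete.FindDisplayNames("smith") would then return the matching display names. Whether this alone brings the lookup into autocomplete range depends on the directory size and on how many entries match.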