c# iis nullreferenceexception lucene.net

A weird NullReferenceException from the J2N HashSet AddIfNotPresent method, called by Lucene.Net


I have posted part of the stack trace below. The exception does not happen regularly and is hard to reproduce, but it usually occurs after the IIS worker process comes back from being idle (run the site, leave it untouched for a while, then open a page that performs a Lucene.Net search). I am not very experienced with IIS, so any help tracking this down would be much appreciated. Thank you.

Exception: System.NullReferenceException: Object reference not set to an instance of an object.
   at J2N.Collections.Generic.HashSet`1.AddIfNotPresent(T value)
   at Lucene.Net.Index.IndexReader.SubscribeToGetCacheKeysEvent(GetCacheKeysEvent getCacheKeysEvent)
   at Lucene.Net.Search.FieldCacheImpl.Cache`2.Get(AtomicReader reader, TKey key, Boolean setDocsWithField)
   at Lucene.Net.Search.FieldCacheImpl.GetInt64s(AtomicReader reader, String field, IInt64Parser parser, Boolean setDocsWithField)
   at Lucene.Net.Search.FieldComparer.Int64Comparer.SetNextReader(AtomicReaderContext context)
   at Lucene.Net.Search.TopFieldCollector.OneComparerNonScoringCollector.SetNextReader(AtomicReaderContext context)
   at Lucene.Net.Search.IndexSearcher.Search(IList`1 leaves, Weight weight, ICollector collector)

The code that calls the Lucene API is:

var sectionResults = await Task.WhenAll(sectionsData
  .Select(data => lucene.ExploreQuery(data.Sorting, data.PageTypeFilter, null, data.CategoryId, null, null, null, null, data.Keyword))
  .Select(f => lucene.RetrieveResults(1, 4, f.Sort, null, f.Filter)));

I wonder whether I should increase the application pool idle timeout, which might make the exception occur less often: https://catchsoftware.com/knowledge-base/application-pool-timeouts/

I looked at the AddIfNotPresent method, but I cannot tell which reference might be null at runtime. Even if I knew, I could not change it, because it is part of a library.

private bool AddIfNotPresent(T value)
{
    if (_buckets == null)
    {
        Initialize(0);
    }

    int hashCode;
    int bucket;
    int collisionCount = 0;
    Slot[] slots = _slots;

    IEqualityComparer<T>? comparer = _comparer;

    if (comparer == null)
    {
        hashCode = value == null ? 0 : InternalGetHashCode(value.GetHashCode());
        bucket = hashCode % _buckets!.Length;

        if (default(T)! != null) // TODO-NULLABLE: default(T) == null warning (https://github.com/dotnet/roslyn/issues/34757)
        {
            for (int i = _buckets[bucket] - 1; i >= 0; i = slots[i].next)
            {
                if (slots[i].hashCode == hashCode && EqualityComparer<T>.Default.Equals(slots[i].value, value))
                {
                    return false;
                }

                if (collisionCount >= slots.Length)
                {
                    // The chain of entries forms a loop, which means a concurrent update has happened.
                    throw new InvalidOperationException(SR.InvalidOperation_ConcurrentOperationsNotSupported);
                }
                collisionCount++;
            }
        }
        else
        {
            // Object type: Shared Generic, EqualityComparer<TValue>.Default won't devirtualize
            // https://github.com/dotnet/coreclr/issues/17273
            // So cache in a local rather than get EqualityComparer per loop iteration
            IEqualityComparer<T> defaultComparer = EqualityComparer<T>.Default;

            for (int i = _buckets[bucket] - 1; i >= 0; i = slots[i].next)
            {
                if (slots[i].hashCode == hashCode && defaultComparer.Equals(slots[i].value, value))
                {
                    return false;
                }

                if (collisionCount >= slots.Length)
                {
                    // The chain of entries forms a loop, which means a concurrent update has happened.
                    throw new InvalidOperationException(SR.InvalidOperation_ConcurrentOperationsNotSupported);
                }
                collisionCount++;
            }
        }
    }
    else
    {
        hashCode = value == null ? 0 : InternalGetHashCode(comparer.GetHashCode(value));
        bucket = hashCode % _buckets!.Length;

        for (int i = _buckets[bucket] - 1; i >= 0; i = slots[i].next)
        {
            if (slots[i].hashCode == hashCode && comparer.Equals(slots[i].value, value))
            {
                return false;
            }

            if (collisionCount >= slots.Length)
            {
                // The chain of entries forms a loop, which means a concurrent update has happened.
                throw new InvalidOperationException(SR.InvalidOperation_ConcurrentOperationsNotSupported);
            }
            collisionCount++;
        }
    }

    int index;
    if (_freeList >= 0)
    {
        index = _freeList;
        _freeList = slots[index].next;
    }
    else
    {
        if (_lastIndex == slots.Length)
        {
            IncreaseCapacity();
            // this will change during resize
            slots = _slots;
            bucket = hashCode % _buckets.Length;
        }
        index = _lastIndex;
        _lastIndex++;
    }
    slots[index].hashCode = hashCode;
    slots[index].value = value;
    slots[index].next = _buckets[bucket] - 1;
    _buckets[bucket] = index + 1;
    _count++;
    _version++;

    return true;
}

Update: Tim Schmelter's advice was very helpful. The following code in Lucene.Net shows that the instance field getCacheKeysEvents is a plain JCG.HashSet with no synchronization around it, so calling SubscribeToGetCacheKeysEvent from several threads at once can cause this exception.

[ExcludeFromRamUsageEstimation]
private readonly ISet<WeakEvents.GetCacheKeysEvent> getCacheKeysEvents = new JCG.HashSet<WeakEvents.GetCacheKeysEvent>();
internal void SubscribeToGetParentReadersEvent(WeakEvents.GetParentReadersEvent getParentReadersEvent)
{
    if (getParentReadersEvent is null)
        throw new ArgumentNullException(nameof(getParentReadersEvent));
    if (getParentReadersEvents.Add(getParentReadersEvent))
        getParentReadersEvent.Subscribe(OnGetParentReaders);
}

internal void SubscribeToGetCacheKeysEvent(WeakEvents.GetCacheKeysEvent getCacheKeysEvent)
{
    if (getCacheKeysEvent is null)
        throw new ArgumentNullException(nameof(getCacheKeysEvent));
    if (getCacheKeysEvents.Add(getCacheKeysEvent))
        getCacheKeysEvent.Subscribe(OnGetCacheKeys);
} 
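
To illustrate why an unsynchronized set is a problem here, the following is a minimal sketch of the usual remedy: guarding every access to the set with a lock. It is not Lucene.Net's code. SubscriptionRegistry, TryAdd and syncRoot are hypothetical names made up for this example, and it uses the BCL HashSet<T> rather than the J2N one.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical wrapper (not part of Lucene.Net or J2N): a set whose
// Add is safe to call from several threads because of the lock.
public sealed class SubscriptionRegistry<T> where T : class
{
    private readonly HashSet<T> items = new HashSet<T>();
    private readonly object syncRoot = new object();

    // Returns true only for the first caller that adds a given item.
    public bool TryAdd(T item)
    {
        if (item == null)
            throw new ArgumentNullException(nameof(item));

        lock (syncRoot)
        {
            return items.Add(item);
        }
    }

    public int Count
    {
        get { lock (syncRoot) { return items.Count; } }
    }
}

public static class RegistryDemo
{
    public static void Main()
    {
        var registry = new SubscriptionRegistry<string>();

        // Many threads add distinct items at the same time.
        Parallel.For(0, 10000, i => registry.TryAdd("event-" + i));

        Console.WriteLine(registry.Count); // prints 10000
    }
}
```

Without the lock, the same Parallel.For loop against a bare HashSet<T> may lose items or throw, which is the kind of failure the stack trace above points at.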

Solution

  • Thanks again to @TimSchmelter for the constructive advice. I finally solved the issue by changing the code that calls the Lucene engine so that the searches no longer run concurrently (Task.WhenAll) but one at a time, each awaited before the next one starts. The new release has been running for more than a week and the issue has not come back. The J2N HashSet is used internally by Lucene.Net, so we cannot touch it without modifying the source and building a custom release, which would be rather costly.

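    // Select is lazy: each search task is only created when the foreach below reaches it,
    // so awaiting inside the loop runs the searches one at a time.
    // Do not materialize searchTasks (ToList/ToArray), or all searches would start at once.
    var searchTasks = sectionsData.Select(data => lucene.ExploreQuery(data.Sorting, data.PageTypeFilter, null, data.CategoryId, null, null, null, null, data.Keyword))
      .Select(f => lucene.RetrieveResults(1, 4, f.Sort, null, f.Filter));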
    var sectionResults = new J2NGeneric.List<ServiceResponse>();
    foreach (var task in searchTasks)
    {
      sectionResults.Add(await task);
    }
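
    The difference between the two calling styles can be checked in isolation. The sketch below is only an illustration under assumptions: FakeSearchAsync is a hypothetical stand-in for lucene.RetrieveResults that does nothing but count how many calls overlap, and the 50 ms delay is arbitrary.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class SequentialVsConcurrentDemo
{
    private static readonly object Gate = new object();
    private static int inFlight;
    private static int maxInFlight;

    // Hypothetical stand-in for a search call: tracks how many calls overlap.
    private static async Task<int> FakeSearchAsync(int id)
    {
        lock (Gate)
        {
            inFlight++;
            maxInFlight = Math.Max(maxInFlight, inFlight);
        }

        await Task.Delay(50); // simulated work

        lock (Gate) { inFlight--; }
        return id;
    }

    private static void Reset()
    {
        lock (Gate) { inFlight = 0; maxInFlight = 0; }
    }

    // Task.WhenAll enumerates the whole sequence first, so every search is started up front.
    public static async Task<int> MaxOverlapConcurrentAsync(int count)
    {
        Reset();
        var tasks = Enumerable.Range(0, count).Select(i => FakeSearchAsync(i));
        await Task.WhenAll(tasks);
        return maxInFlight;
    }

    // The lazy sequence creates one task per loop iteration, and each is awaited before the next starts.
    public static async Task<int> MaxOverlapSequentialAsync(int count)
    {
        Reset();
        var tasks = Enumerable.Range(0, count).Select(i => FakeSearchAsync(i));
        foreach (var task in tasks)
        {
            await task;
        }
        return maxInFlight;
    }

    public static async Task Main()
    {
        Console.WriteLine("WhenAll overlap: " + await MaxOverlapConcurrentAsync(4));
        Console.WriteLine("foreach overlap: " + await MaxOverlapSequentialAsync(4));
    }
}
```

    The foreach version never has more than one call in flight, while the WhenAll version should normally report all four overlapping. The trade-off is latency: the total time becomes the sum of the individual searches rather than the longest one.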