When we try to rebuild our Lucene (ContentSearch) indexes, our CrawlingLog is filled up with these exceptions:
7052 15:08:21 WARN Crawler : AddRecursive DoItemAdd failed - {5A1E50E4-46B9-42D5-B743-1ED10D15D47E}
Exception: System.AggregateException
Message: One or more errors occurred.
Source: mscorlib
at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
at System.Threading.Tasks.Task.Wait()
at System.Threading.Tasks.Parallel.PartitionerForEachWorker[TSource,TLocal](Partitioner`1 source, ParallelOptions parallelOptions, Action`1 simpleBody, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEachWorker[TSource,TLocal](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Action`3 bodyWithStateAndIndex, Func`4 bodyWithStateAndLocal, Func`5 bodyWithEverything, Func`1 localInit, Action`1 localFinally)
at System.Threading.Tasks.Parallel.ForEach[TSource](IEnumerable`1 source, ParallelOptions parallelOptions, Action`1 body)
at Sitecore.ContentSearch.AbstractDocumentBuilder`1.AddItemFields()
at Sitecore.ContentSearch.LuceneProvider.CrawlerLuceneIndexOperations.GetIndexData(IIndexable indexable, IProviderUpdateContext context)
at Sitecore.ContentSearch.LuceneProvider.CrawlerLuceneIndexOperations.BuildDataToIndex(IProviderUpdateContext context, IIndexable version)
at Sitecore.ContentSearch.LuceneProvider.CrawlerLuceneIndexOperations.Add(IIndexable indexable, IProviderUpdateContext context, ProviderIndexConfiguration indexConfiguration)
at Sitecore.ContentSearch.SitecoreItemCrawler.DoAdd(IProviderUpdateContext context, SitecoreIndexableItem indexable)
at Sitecore.ContentSearch.HierarchicalDataCrawler`1.CrawlItem(Tuple`3 tuple)
Nested Exception
Exception: System.ArgumentOutOfRangeException
Message: Index and length must refer to a location within the string.
Parameter name: length
Source: mscorlib
at System.String.InternalSubStringWithChecks(Int32 startIndex, Int32 length, Boolean fAlwaysCopy)
at Sitecore.Data.ShortID.Encode(String guid)
at Sitecore.ContentSearch.FieldReaders.MultiListFieldReader.GetFieldValue(IIndexableDataField indexableField)
at Sitecore.ContentSearch.FieldReaders.FieldReaderMap.GetFieldValue(IIndexableDataField field)
at Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder.AddField(IIndexableDataField field)
at System.Threading.Tasks.Parallel.<>c__DisplayClass32`2.<PartitionerForEachWorker>b__30()
at System.Threading.Tasks.Task.InnerInvokeWithArg(Task childTask)
at System.Threading.Tasks.Task.<>c__DisplayClass11.<ExecuteSelfReplicating>b__10(Object param0)
This appears to be caused by the ShortID.Encode(string) method expecting the GUID in the string parameter to have braces ("{" and "}") around it. Some of our multilist field relationships were associated programmatically using Guid.ToString(), which does not include the braces. Unfortunately, these values cause the ShortID.Encode() method to choke.
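To illustrate the difference between the two string formats, here is a minimal, self-contained sketch. EncodeAssumingBraces is a hypothetical stand-in that reads the hex groups at fixed offsets after a leading brace; it is not Sitecore's actual ShortID.Encode implementation, only a guess at the kind of logic that would fail this way.

```csharp
using System;

public static class GuidFormatDemo
{
    // Hypothetical stand-in (NOT Sitecore's code): assumes the input looks like
    // "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}" and reads each group at a fixed offset.
    public static string EncodeAssumingBraces(string guid)
    {
        return guid.Substring(1, 8)
             + guid.Substring(10, 4)
             + guid.Substring(15, 4)
             + guid.Substring(20, 4)
             + guid.Substring(25, 12);
    }

    public static void Main()
    {
        Guid id = new Guid("5A1E50E4-46B9-42D5-B743-1ED10D15D47E");

        string plain = id.ToString();      // default "D" format: 36 characters, no braces
        string braced = id.ToString("B");  // "B" format: 38 characters, wrapped in braces

        Console.WriteLine(plain);
        Console.WriteLine(braced);
        Console.WriteLine(EncodeAssumingBraces(braced));

        try
        {
            // The unbraced string is two characters shorter, so the last
            // Substring call runs past the end of the string.
            EncodeAssumingBraces(plain);
        }
        catch (ArgumentOutOfRangeException ex)
        {
            Console.WriteLine(ex.GetType().Name);
        }
    }
}
```

With the unbraced value, the final Substring call asks for more characters than remain, which matches the kind of ArgumentOutOfRangeException seen in the nested exception above.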
Update the code that calls MultiListField.Add(string) and change Guid.ToString() to Guid.ToString("B"). This will resolve the issue for all new relationships.
For the values that are already stored, create a custom FieldReader class to replace the standard MultiListFieldReader (we called ours CustomMultiListFieldReader).
Inherit from Sitecore.ContentSearch.FieldReaders.FieldReader.
Copy the Sitecore.ContentSearch.FieldReaders.MultiListFieldReader.GetFieldValue(IIndexableDataField) method into your custom class.
Before the if (ID.IsID(id)) line, add the following code:
if (!str.StartsWith("{") && !str.EndsWith("}"))
id = String.Format("{{{0}}}", str);
In your index configuration (we added ours to the default, Sitecore.ContentSearch.DefaultIndexConfiguration.config), change the fieldReaderType for the MultiList fields to your custom type. (This can be found in your config at sitecore/contentSearch/configuration/defaultIndexConfiguration/fieldReaders/mapFieldByTypeName/fieldReader.)
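As an alternative to the two-line brace check in the steps above, the normalization could be done with a small helper that parses the value and re-emits it in the braced format. This is only a sketch: GuidNormalizer and ToBraced are hypothetical names, and the upper-casing is an assumption about the preferred ID style, not something the Sitecore API requires.

```csharp
using System;

public static class GuidNormalizer
{
    // Hypothetical helper: returns the GUID in braced ("B") format,
    // or null when the input is not a parseable GUID.
    public static string ToBraced(string raw)
    {
        Guid parsed;

        // Guid.TryParse accepts the value with or without braces.
        if (!Guid.TryParse(raw, out parsed))
        {
            return null;
        }

        // Upper-casing is an assumption made for consistency with braced IDs
        // such as the one in the log above.
        return parsed.ToString("B").ToUpperInvariant();
    }
}
```

Inside the copied GetFieldValue method, each raw multilist value could be passed through a helper like this before the ID check, so that values with and without braces end up in the same form.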
Full disclosure: I don't love this approach because if the default implementation of the MultiListFieldReader ever changed, we'd be without those changes. But this allows the items to be included in the index without reformatting all of the GUIDs in every multilist field.