
How to serialize type information using custom serializer also for sub documents using NEST and Elasticsearch


I'm following the example at https://www.elastic.co/guide/en/elasticsearch/client/net-api/current/custom-serialization.html#_serializing_type_information to include $type information in the documents stored in Elasticsearch.

However, as the page itself notes, this only emits type information for the outer document:

"the type information is serialized for the outer MyDocument instance, but not for each MySubDocument instance in the SubDocuments collection."

Does anyone know how to get type information for the sub documents as well?

I've tried the same JsonSerializerSettings as in their example outside of Elasticsearch (in LINQPad), and there I do get type information for the sub documents too:

void Main()
{
    var temp = new ListBlock
    {
        Id = 1,
        Title = "Titel",
        Blocks = new List<BlockContent> {
            new InnerBlock {
                Id = 11,
                MyProperty ="Inner Property"
            },
            new InnerBlock2 {
                Id = 12,
                MyProperty2 = "Inner property 2"
            }
        }
    };

    var serializeOptions = new Newtonsoft.Json.JsonSerializerSettings
    {
        TypeNameHandling = Newtonsoft.Json.TypeNameHandling.All,
        NullValueHandling = Newtonsoft.Json.NullValueHandling.Ignore,
        TypeNameAssemblyFormatHandling = Newtonsoft.Json.TypeNameAssemblyFormatHandling.Simple,
        Formatting = Newtonsoft.Json.Formatting.Indented
    };
    var serialized = Newtonsoft.Json.JsonConvert.SerializeObject(temp, serializeOptions);
    serialized.Dump();
}

public class BlockContent
{
    public int Id { get; set; }
}

public class ListBlock : BlockContent
{
    public string Title { get; set; }
    public List<BlockContent> Blocks { get; set; }
}

public class ListBlock2 : BlockContent
{
    public string Title2 { get; set; }
    public List<BlockContent> Blocks { get; set; }
}

public class InnerBlock : BlockContent
{
    public string MyProperty { get; set; }
}

public class InnerBlock2 : BlockContent
{
    public string MyProperty2 { get; set; }
}

This code results in the following json:

{
  "$type": "UserQuery+ListBlock, LINQPadQuery",
  "Title": "Titel",
  "Blocks": {
    "$type": "System.Collections.Generic.List`1[[UserQuery+BlockContent, LINQPadQuery]], System.Private.CoreLib",
    "$values": [
      {
        "$type": "UserQuery+InnerBlock, LINQPadQuery",
        "MyProperty": "Inner Property",
        "Id": 11
      },
      {
        "$type": "UserQuery+InnerBlock2, LINQPadQuery",
        "MyProperty2": "Inner property 2",
        "Id": 12
      }
    ]
  },
  "Id": 1
}
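
To show why I want the $type entries on the sub documents, here is a minimal sketch (plain Json.NET, no Elasticsearch involved) that round-trips a trimmed-down version of the model above with the same settings. The RoundTripDemo class and its RoundTrip method are names made up for this illustration only.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class BlockContent
{
    public int Id { get; set; }
}

public class ListBlock : BlockContent
{
    public string Title { get; set; }
    public List<BlockContent> Blocks { get; set; }
}

public class InnerBlock : BlockContent
{
    public string MyProperty { get; set; }
}

public static class RoundTripDemo
{
    // Same settings as in the LINQPad snippet, minus the indentation.
    private static readonly JsonSerializerSettings Settings = new JsonSerializerSettings
    {
        TypeNameHandling = TypeNameHandling.All,
        NullValueHandling = NullValueHandling.Ignore,
        TypeNameAssemblyFormatHandling = TypeNameAssemblyFormatHandling.Simple
    };

    // Serializes the value and reads it back as the base type only.
    public static BlockContent RoundTrip(BlockContent value)
    {
        var json = JsonConvert.SerializeObject(value, Settings);
        return JsonConvert.DeserializeObject<BlockContent>(json, Settings);
    }

    public static void Main()
    {
        var original = new ListBlock
        {
            Id = 1,
            Title = "Titel",
            Blocks = new List<BlockContent>
            {
                new InnerBlock { Id = 11, MyProperty = "Inner Property" }
            }
        };

        var restored = (ListBlock)RoundTrip(original);

        // The $type metadata lets Json.NET rebuild the derived types.
        Console.WriteLine(restored.GetType().Name);           // ListBlock
        Console.WriteLine(restored.Blocks[0].GetType().Name); // InnerBlock
    }
}
```

Without $type on the items, the entries in Blocks could only come back as plain BlockContent instances, which is what I'm trying to avoid for documents read from Elasticsearch.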

Using these versions at the moment:

  • Elasticsearch 7.4.2
  • Nest 7.4.2

Update:

The solution provided by Russ Cam below works like a charm for the data model used in that answer. However, I've put together an example below that mirrors how we actually create indices (using AutoMap) and bulk index the initial list of documents. It works fine if we leave the list of Guids (CategoryIds) out of the model, but as soon as we include it the bulk request fails with the following error response:

{
    "took": 8,
    "errors": true,
    "items": [{
            "index": {
                "_index": "testindex",
                "_type": "_doc",
                "_id": "1",
                "status": 400,
                "error": {
                    "type": "mapper_parsing_exception",
                    "reason": "failed to parse field [categoryIds] of type [keyword] in document with id '1'. Preview of field's value: '{$values=[], $type=System.Collections.Generic.List`1[[System.Guid, System.Private.CoreLib]], System.Private.CoreLib}'",
                    "caused_by": {
                        "type": "illegal_state_exception",
                        "reason": "Can't get text on a START_OBJECT at 1:140"
                    }
                }
            }
        }, {
            "index": {
                "_index": "testindex",
                "_type": "_doc",
                "_id": "2",
                "status": 400,
                "error": {
                    "type": "mapper_parsing_exception",
                    "reason": "failed to parse field [categoryIds] of type [keyword] in document with id '2'. Preview of field's value: '{$values=[], $type=System.Collections.Generic.List`1[[System.Guid, System.Private.CoreLib]], System.Private.CoreLib}'",
                    "caused_by": {
                        "type": "illegal_state_exception",
                        "reason": "Can't get text on a START_OBJECT at 1:141"
                    }
                }
            }
        }
    ]
}

Here is a simple (.NET 5) console application that should let others reproduce this behaviour:

using System;
using System.Collections.Generic;
using System.Linq;
using Elasticsearch.Net;
using Nest;
using Nest.JsonNetSerializer;
using Newtonsoft.Json;

namespace ElasticsearchTypeSerializer
{
    internal class Program
    {
        private const string IndexName = "testindex";

        private static void Main(string[] args)
        {
            var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
            var settings = new ConnectionSettings(pool,
                (builtin, settings) => new MySecondCustomJsonNetSerializer(builtin, settings));
            settings.DisableDirectStreaming();
            var client = new ElasticClient(settings);

            CreateIndex(client);
            IndexDocuments(client);
            var documents = GetDocuments(client);
        }

        private static void CreateIndex(IElasticClient client)
        {
            var createIndexResponse = client.Indices.Create(IndexName, x => x.Map<MyDocument>(m => m.AutoMap()));
        }

        private static void IndexDocuments(IElasticClient client)
        {
            var documents = new List<MyDocument>
            {
                new()
                {
                    Id = 1,
                    Name = "My first document",
                    OwnerId = 2,
                    SubDocuments = new List<SubDocument>
                    {
                        new MySubDocument {Id = 11, Name = "my first sub document"},
                        new MySubDocument2 {Id = 12, Description = "my second sub document"}
                    }
                },
                new()
                {
                    Id = 2,
                    Name = "My second document",
                    OwnerId = 3,
                    SubDocuments = new List<SubDocument>
                    {
                        new MySubDocument {Id = 21, Name = "My third sub document"}
                    }
                }
            };

            var bulkIndexResponse = client.Bulk(b => b.Index(IndexName).IndexMany(documents).Refresh(Refresh.True));
        }

        private static IEnumerable<MyDocument> GetDocuments(IElasticClient client)
        {
            var searchResponse = client.Search<MyDocument>(s => s.Index(IndexName).Query(q => q.MatchAll()));
            var documents = searchResponse.Documents.ToList();
            return documents;
        }
    }


    public class MyDocument
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public string FilePath { get; set; }
        public int OwnerId { get; set; }
        public List<Guid> CategoryIds { get; set; } = new();
        public List<SubDocument> SubDocuments { get; set; }
    }

    public class SubDocument
    {
        public int Id { get; set; }
    }

    public class MySubDocument : SubDocument
    {
        public string Name { get; set; }
    }

    public class MySubDocument2 : SubDocument
    {
        public string Description { get; set; }
    }


    public class MySecondCustomJsonNetSerializer : ConnectionSettingsAwareSerializerBase
    {
        public MySecondCustomJsonNetSerializer(IElasticsearchSerializer builtinSerializer,
            IConnectionSettingsValues connectionSettings)
            : base(builtinSerializer, connectionSettings)
        {
        }

        protected override JsonSerializerSettings CreateJsonSerializerSettings()
        {
            return new()
            {
                TypeNameHandling = TypeNameHandling.All,
                NullValueHandling = NullValueHandling.Ignore,
                TypeNameAssemblyFormatHandling = TypeNameAssemblyFormatHandling.Simple
            };
        }
    }
}

Any help regarding this issue is very much appreciated!


Solution

  • If you want type information to be included for typed collections on a document, omit the derived contract resolver from the documentation example: it is that resolver which suppresses type handling for collection item types

    private static void Main()
    {
        var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
        var settings = new ConnectionSettings(pool,
            (builtin, settings) => new MySecondCustomJsonNetSerializer(builtin, settings));

        var client = new ElasticClient(settings);

        var document = new MyDocument
        {
            Id = 1,
            Name = "My first document",
            OwnerId = 2,
            SubDocuments = new[]
            {
                new MySubDocument { Name = "my first sub document" },
                new MySubDocument { Name = "my second sub document" },
            }
        };

        var indexResponse = client.IndexDocument(document);
    }
    
    public class MyDocument
    {
        public int Id { get; set; }
    
        public string Name { get; set; }
    
        public string FilePath { get; set; }
    
        public int OwnerId { get; set; }
    
        public IEnumerable<MySubDocument> SubDocuments { get; set; }
    }
    
    public class MySubDocument
    {
        public string Name { get; set; }
    }
    
    public class MySecondCustomJsonNetSerializer : ConnectionSettingsAwareSerializerBase
    {
        public MySecondCustomJsonNetSerializer(IElasticsearchSerializer builtinSerializer, IConnectionSettingsValues connectionSettings)
            : base(builtinSerializer, connectionSettings) { }
    
        protected override JsonSerializerSettings CreateJsonSerializerSettings() =>
            new JsonSerializerSettings
            {
                TypeNameHandling = TypeNameHandling.All,
                NullValueHandling = NullValueHandling.Ignore,
                TypeNameAssemblyFormatHandling = TypeNameAssemblyFormatHandling.Simple
            };
    }
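
Regarding the update in the question: with TypeNameHandling.All, Json.NET also wraps collections in a { "$type": ..., "$values": [...] } object, which is why the keyword-mapped categoryIds field receives an object instead of an array. One option that might be worth trying is TypeNameHandling.Objects, which writes $type for objects (including each sub document) but leaves collections as plain JSON arrays. The sketch below uses plain Json.NET only and a cut-down copy of the question's model; TypeNameHandlingDemo and its Serialize helper are hypothetical names for this illustration, and I have not tested the change against Elasticsearch 7.4.2 itself.

```csharp
using System;
using System.Collections.Generic;
using Newtonsoft.Json;

public class SubDocument
{
    public int Id { get; set; }
}

public class MySubDocument : SubDocument
{
    public string Name { get; set; }
}

public class MyDocument
{
    public int Id { get; set; }
    public List<Guid> CategoryIds { get; set; } = new List<Guid>();
    public List<SubDocument> SubDocuments { get; set; }
}

public static class TypeNameHandlingDemo
{
    // Serializes with the settings from the question, varying only TypeNameHandling.
    public static string Serialize(object value, TypeNameHandling handling) =>
        JsonConvert.SerializeObject(value, new JsonSerializerSettings
        {
            TypeNameHandling = handling,
            NullValueHandling = NullValueHandling.Ignore,
            TypeNameAssemblyFormatHandling = TypeNameAssemblyFormatHandling.Simple
        });

    public static void Main()
    {
        var document = new MyDocument
        {
            Id = 1,
            SubDocuments = new List<SubDocument>
            {
                new MySubDocument { Id = 11, Name = "my first sub document" }
            }
        };

        // All: CategoryIds becomes an object with "$type" and "$values".
        Console.WriteLine(Serialize(document, TypeNameHandling.All));

        // Objects: CategoryIds stays a plain array, sub documents still carry "$type".
        Console.WriteLine(Serialize(document, TypeNameHandling.Objects));
    }
}
```

If that output shape suits your mapping, the corresponding change in the custom serializer would be to return TypeNameHandling.Objects instead of TypeNameHandling.All from CreateJsonSerializerSettings. Since every item in SubDocuments still carries its own $type, the derived types should remain recoverable when reading the documents back.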