elasticsearch nest

How to use MultiTermVectorsAsync


I am trying to run the query below using NEST:

GET 123_original/_doc/_mtermvectors
{
  "ids": [
    "9a271078-086f-4f4b-8ca0-16376c2f49a7",
    "481ce3db-69bf-4886-9c38-fcb878d44925"
  ],
  "parameters": {
    "fields": ["*"],
    "positions": false,
    "offsets": false,
    "payloads": false,
    "term_statistics": false,
    "field_statistics": false
  }
}

The NEST call (I think) would look something like this:

var term = await elasticClient.MultiTermVectorsAsync(x =>
{
    return x.Index(indexOriginal)    // 123_original
        .Type(typeName)              // _doc
        .GetMany<ElasticDataSet>(ids.Keys) // list of strings
        .Fields("*")
        .FieldStatistics(false)
        .Positions(false)
        .Offsets(false)
        .TermStatistics(false)
        .Payloads(false);
});

The problem is that the call above fails with the following error:

Index name is null for the given type and no default index is set. Map an index name using ConnectionSettings.DefaultMappingFor<TDocument>() or set a default index using ConnectionSettings.DefaultIndex().

This is the request it is trying to execute. It includes the index but is missing the ids; the same request works in Kibana once the ids are set:

123_original/_doc/_mtermvectors?fields=%2A&field_statistics=false&positions=false&offsets=false&term_statistics=false&payloads=false

I cannot find any documentation on how to use the Multi Term Vectors API with NEST.


Solution

  • The Multi Term Vectors API in NEST does not expose a way to send only ids; it always assumes that you are passing "docs".

    Even when passing

    client.MultiTermVectors(mt => mt
        .Index("123_original")
        .Type("_doc")
        .GetMany<object>(ids)
        .Fields("*")
        .Positions(false)
        .Offsets(false)
        .Payloads(false)
        .TermStatistics(false)
        .FieldStatistics(false)
    );
    

    the _index and _type for each id are inferred from T in GetMany<T> (object here), which produces this request:

    POST http://localhost:9200/123_original/_doc/_mtermvectors?pretty=true&fields=*&positions=false&offsets=false&payloads=false&term_statistics=false&field_statistics=false 
    {
      "docs": [
        {
          "_index": "users",
          "_type": "object",
          "_id": "9a271078-086f-4f4b-8ca0-16376c2f49a7"
        },
        {
          "_index": "users",
          "_type": "object",
          "_id": "481ce3db-69bf-4886-9c38-fcb878d44925"
        }
      ]
    }
    

    I think this could be exposed in a more consumable way within the client in the future.
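
    If sending "docs" instead of "ids" is acceptable, one possible workaround is to map the document type on ConnectionSettings, as the error message in the question suggests, so that the inferred _index and _type match the target index. This is only a sketch under assumptions: it assumes NEST 6.x (where DefaultMappingFor<T> accepts both IndexName and TypeName), the ElasticDataSet class below is a hypothetical stand-in for the asker's type, and the ids are the ones from the question. I have not verified it against the asker's cluster.

    ```csharp
    using System;
    using Nest;

    // Hypothetical stand-in for the asker's document class; its real shape is not shown.
    public class ElasticDataSet
    {
        public string Id { get; set; }
    }

    public static class Program
    {
        public static void Main()
        {
            // Assumption: NEST 6.x. Map the CLR type to the index and type name,
            // so GetMany<ElasticDataSet> no longer fails on index inference.
            var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
                .DefaultMappingFor<ElasticDataSet>(m => m
                    .IndexName("123_original")
                    .TypeName("_doc"));
            var client = new ElasticClient(settings);

            var ids = new[]
            {
                "9a271078-086f-4f4b-8ca0-16376c2f49a7",
                "481ce3db-69bf-4886-9c38-fcb878d44925"
            };

            // Same high-level call as in the question. The body still contains
            // "docs" (one entry per id), not "ids".
            var response = client.MultiTermVectors(mt => mt
                .Index("123_original")
                .Type("_doc")
                .GetMany<ElasticDataSet>(ids)
                .Fields("*")
                .Positions(false)
                .Offsets(false)
                .Payloads(false)
                .TermStatistics(false)
                .FieldStatistics(false));

            Console.WriteLine(response.IsValid ? "ok" : response.DebugInformation);
        }
    }
    ```

    With this mapping, each entry in "docs" should carry "_index": "123_original" and "_type": "_doc" rather than the values inferred from object.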

    The good news is that you can submit the exact query you want with the low-level client exposed on IElasticClient and still get back a high-level response:

    MultiTermVectorsResponse response = 
        client.LowLevel.Mtermvectors<MultiTermVectorsResponse>("123_original", "_doc", PostData.Serializable(new 
        { 
            ids = ids,
            parameters = new
            {
                fields = new[] { "*" },
                positions = false,
                offsets = false,
                payloads = false,
                term_statistics = false,
                field_statistics = false
            }
        }));
    

    which will send the following request:

    POST http://localhost:9200/123_original/_doc/_mtermvectors?pretty=true 
    {
      "ids": [
        "9a271078-086f-4f4b-8ca0-16376c2f49a7",
        "481ce3db-69bf-4886-9c38-fcb878d44925"
      ],
      "parameters": {
        "fields": [
          "*"
        ],
        "positions": false,
        "offsets": false,
        "payloads": false,
        "term_statistics": false,
        "field_statistics": false
      }
    }
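
    Since the low-level call deserializes into the high-level MultiTermVectorsResponse, the result can be read like any other NEST response. The following is a rough sketch rather than a definitive implementation: it assumes NEST 6.x, a node at http://localhost:9200 as in the requests above, and that each returned document exposes Id, Found and TermVectors (the exact property types differ between NEST versions, so the code sticks to var and Count).

    ```csharp
    using System;
    using Elasticsearch.Net;
    using Nest;

    public static class Program
    {
        public static void Main()
        {
            var client = new ElasticClient(
                new ConnectionSettings(new Uri("http://localhost:9200")));

            var ids = new[]
            {
                "9a271078-086f-4f4b-8ca0-16376c2f49a7",
                "481ce3db-69bf-4886-9c38-fcb878d44925"
            };

            // Same low-level request as above, trimmed to the "fields" parameter.
            var response = client.LowLevel.Mtermvectors<MultiTermVectorsResponse>(
                "123_original", "_doc",
                PostData.Serializable(new
                {
                    ids,
                    parameters = new { fields = new[] { "*" } }
                }));

            // IsValid is false on transport errors or error status codes.
            if (!response.IsValid)
            {
                Console.WriteLine(response.DebugInformation);
                return;
            }

            // One entry per requested id; Found is false for missing documents.
            foreach (var doc in response.Documents)
            {
                var fieldCount = doc.TermVectors?.Count ?? 0;
                Console.WriteLine(
                    $"{doc.Id}: found={doc.Found}, fields with term vectors={fieldCount}");
            }
        }
    }
    ```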