
ElasticSearch file mapping using fscrawler and searching doc by NEST in C#


I indexed the documents in the folder "/tmp/es" using fscrawler 2.3-SNAPSHOT. It mapped them as:

{
  "properties" : {
    "attachment" : {
      "type" : "binary",
      "doc_values": false
    },
    "attributes" : {
      "properties" : {
        "group" : {
          "type" : "keyword"
        },
        "owner" : {
          "type" : "keyword"
        }
      }
    },
    "content" : {
      "type" : "text"
    },
    "file" : {
      "properties" : {
        "content_type" : {
          "type" : "keyword"
        },
        "filename" : {
          "type" : "keyword"
        },
        "extension" : {
          "type" : "keyword"
        },
        "filesize" : {
          "type" : "long"
        },
        "indexed_chars" : {
          "type" : "long"
        },
        "indexing_date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "last_modified" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "checksum": {
          "type": "keyword"
        },
        "url" : {
          "type" : "keyword",
          "index" : true
        }
      }
    },
    "object" : {
      "type" : "object"
    },
    "meta" : {
      "properties" : {
        "author" : {
          "type" : "text"
        },
        "date" : {
          "type" : "date",
          "format" : "dateOptionalTime"
        },
        "keywords" : {
          "type" : "text"
        },
        "title" : {
          "type" : "text"
        },
        "language" : {
          "type" : "keyword"
        }
      }
    },
    "path" : {
      "properties" : {
        "encoded" : {
          "type" : "keyword"
        },
        "real" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        },
        "root" : {
          "type" : "keyword"
        },
        "virtual" : {
          "type" : "keyword",
          "fields": {
            "tree": {
              "type" : "text",
              "analyzer": "fscrawler_path",
              "fielddata": true
            }
          }
        }
      }
    }
  }
}

Now I want to search them using NEST in my C# application. I was able to get the content with hit.Source.content, but I cannot get the filename with hit.Source.filename.

Code:

    var response = elasticClient.Search<documents>(s => s
        .Index("tanks")
        .Type("doc")
        .Query(q => q.QueryString(qs => qs.Query(query))));

    if (rtxSearchResult.Text != " ")
    {
        // Clear the previous results before appending the new hits.
        rtxSearchResult.Text = " ";

        foreach (var hit in response.Hits)
        {
            rtxSearchResult.Text += "Name: " + hit.Source.filename.ToString()
                + Environment.NewLine
                + "Content: " + hit.Source.content.ToString()
                + Environment.NewLine
                + "URL: " + hit.Source.url.ToString()
                + Environment.NewLine
                + Environment.NewLine;
        }
    }

The code above throws a NullReferenceException, but it runs when I comment out the lines with hit.Source.url and hit.Source.filename.

Kibana shows the filename field as file.filename, url as file.url, and content as content.

Since filename is nested under file, I am unable to retrieve it. Please help; I have been stuck on this for a couple of days now.


Solution

  • Found the error:

    My documents class was:

    class documents
    {
        public string filename { get; set; }

        public string content { get; set; }

        public string url { get; set; }
    }
    

    Since filename and url are stored in the index as file.filename and file.url, the documents class needed a file property whose type is a second class, File, holding filename and url.

    class documents
    {
        // Mirrors the "file" object in the mapping.
        public File file { get; set; }

        public string content { get; set; }
    }

    class File
    {
        public string filename { get; set; }

        public string url { get; set; }
    }
    

    With that change I was able to access them as hit.Source.file.filename and hit.Source.file.url.
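    The effect of the class shape can be reproduced without a cluster. The sketch below is only an illustration under stated assumptions: System.Text.Json stands in for NEST's own serializer, the sample JSON and its values are invented, and the names FlatDocument, NestedDocument, FilePart and Demo are hypothetical. It shows that a flat class leaves filename null because the JSON has no top-level filename key, while a nested class picks it up, and it adds null-safe formatting for documents that might lack a file object.

```csharp
using System;
using System.Text.Json;

// Flat shape from the question: the JSON has no top-level "filename" or "url".
public class FlatDocument
{
    public string filename { get; set; }
    public string content { get; set; }
    public string url { get; set; }
}

// Nested shape from the solution: mirrors the "file" object in the mapping.
public class FilePart
{
    public string filename { get; set; }
    public string url { get; set; }
}

public class NestedDocument
{
    public FilePart file { get; set; }
    public string content { get; set; }
}

public static class Demo
{
    // Hypothetical _source body; the values are invented for illustration.
    public const string SampleSource =
        "{\"content\":\"hello tanks\"," +
        "\"file\":{\"filename\":\"report.pdf\",\"url\":\"file:///tmp/es/report.pdf\"}}";

    // Formats one document; ?. and ?? avoid a NullReferenceException
    // when the "file" object or one of its fields is missing.
    public static string Describe(NestedDocument doc)
    {
        return "Name: " + (doc.file?.filename ?? "(none)") + Environment.NewLine
             + "Content: " + (doc.content ?? "") + Environment.NewLine
             + "URL: " + (doc.file?.url ?? "(none)");
    }

    public static void Main()
    {
        // The unknown "file" key is ignored, so filename and url stay null.
        var flat = JsonSerializer.Deserialize<FlatDocument>(SampleSource);
        Console.WriteLine("flat.filename is null: " + (flat.filename == null));

        // The nested class matches the JSON structure, so the values are populated.
        var nested = JsonSerializer.Deserialize<NestedDocument>(SampleSource);
        Console.WriteLine(Describe(nested));
    }
}
```

    In the NEST loop, the same null-conditional access (hit.Source.file?.filename) would keep one document without a file object from aborting the whole result listing.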