
How do I query indexed documents? A query on title and content returns nothing, even though both fields contain text. Does this mean my index isn't "refined" enough?


I'm new to Elasticsearch querying.
I have inserted a few documents under an apprentissage index.

If I run a match_all, I can see, for example:

GET apprentissage/_search
{
  "query": {
    "match_all": {}
  }
}
[...]
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 11,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "apprentissage",
        "_id": "s020-la_comparaison_echantillons",
        "_score": 1,
        "_ignored": [
          "attachment.content.keyword",
          "data.keyword",
          "attachment.keywords.keyword"
        ],
        "_source": {
          "data": [...],

          "attachment": {
            "date": "2023-09-27T03:14:14Z",
            "keywords": "comparaison, échantillon, comparaison d’échantillons, comparaison de moyenne d’échantillon, Student, Fisher, risque d’erreur alpha, degré de liberté, table de Student, intervalle de confiance d’une différence de moyennes, appariement, échantillons appareillés, moyenne des différences, dispersion des différences, analyse de la variance à un facteur, dispersion des moyennes, variance iter-groupe, variance intra-groupe, Loi de Fisher, Loi de Student",
            "content_type": "application/pdf",
            "author": "Marc Le Bihan",
            "format": "application/pdf; version=1.5",
            "modified": "2023-09-27T03:14:14Z",
            "language": "fr",
            "title": "s020. La comparaison d’échantillons",
            "creator_tool": "LaTeX via pandoc",
            "content": """s020. La comparaison d’échantillons

Marc Le Bihan

La comparaison d’échantillons met en oeuvre les tests statistiques.
[...]

But when I try to search for the word comparaison with:

GET apprentissage/_search
{
  "query": {
    "query_string": {
      "fields": [ "content", "title" ],
      "query": "comparaison"
    }
  }
}

I get no matches:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 0,
      "relation": "eq"
    },
    "max_score": null,
    "hits": []
  }
}

Why? Is my apprentissage index not yet searchable for text?

I created it by PDF ingestion.


Solution

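  • In your _source, title and content are nested under the attachment object, so a query on the bare names content and title matches nothing. You need to reference the full path of each field, as shown below: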

    GET apprentissage/_search
    {
      "query": {
        "query_string": {
          "fields": [ "attachment.content", "attachment.title" ],
          "query": "comparaison"
        }
      }
    }
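
    If it is unclear which full paths a document exposes, one way to check is to flatten the _source of a hit into dotted names. The following is only a minimal sketch in plain Python, not part of Elasticsearch or its client: field_paths is a hypothetical helper, and the source dict is a hand-trimmed copy of the hit shown in the question (the elided data field is left out).

```python
def field_paths(source, prefix=""):
    """Yield the dotted path of every leaf field in a nested _source dict."""
    for key, value in source.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Nested object: descend and keep building the dotted path.
            yield from field_paths(value, path)
        else:
            yield path


# Hand-trimmed copy of the _source from the question (assumption: only
# three of the attachment sub-fields are kept for brevity).
source = {
    "attachment": {
        "title": "s020. La comparaison d’échantillons",
        "content": "La comparaison d’échantillons met en oeuvre les tests statistiques.",
        "language": "fr",
    }
}

paths = sorted(field_paths(source))
print(paths)
# ['attachment.content', 'attachment.language', 'attachment.title']
```

    No bare content or title appears in the list, which is consistent with the original query returning zero hits: those names only exist under attachment.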