
How to add an "analyzer": "french" entry to an Elastic mapping property, using ElasticsearchClient?


I'm creating an Elastic index from a Java object.

public class JeuDeDonnees {
   private String titre;
   private String slug;
   private String acronyme;
   private String url;
   private OrganisationId organisationId;
   private String organisation;
   private String description;
   private String frequence;
   [... Getters and setters ...]
}  

The following unit test does this successfully.
It runs a co.elastic.clients.elasticsearch.ElasticsearchClient (version 8.11.3), and the index mapping for the object is defined at insertion time.

After that, it lists the fields it has set in the mapping.

void insertionCatalogue() {
   assertTrue(this.catalogDatagouvElastic.createIndex(), "The target index was not created");

   // At least one document must be inserted before the mappings exist.
   List<JeuDeDonnees> jeux = this.jeuxDeDonneesDataset.catalogueDataset().limit(100).collectAsList();
   JeuxDeDonnees jeuxDeDonnees = new JeuxDeDonnees(jeux);
   assertDoesNotThrow(() -> this.catalogDatagouvElastic.insert(jeuxDeDonnees), "Inserting catalogue datasets into Elastic failed");

   // Read the index mappings back
   TypeMapping mappings = this.catalogDatagouvElastic.getIndex().mappings();

   Set<String> champsEnFrancais = Set.of("titre", "description", "organisation");
   Map<String, Property> properties = mappings.properties();

   properties.forEach((nomChamp, property) -> {
      if (champsEnFrancais.contains(nomChamp)) {
         LOGGER.info("{}: {}", nomChamp, property.toString());
      }
   });
}

The LOGGER.info(...) lists the mappings I would like to change:

test : titre: Property: {"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}
test : description: Property: {"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}
test : organisation: Property: {"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}

I'd like to add an "analyzer": "french" entry to their properties,
as I am able to do if I create my index manually with the Kibana dev console:

"titre": {
  "type": "text",
  "fields": {
    "keyword": {
      "type": "keyword",
      "ignore_above": 256
    }
  },
  "analyzer": "french"
}

but the co.elastic.clients.elasticsearch._types.mapping.Property class
that describes each of my Java object field members
doesn't seem to have a way to add that analyzer declaration.

How may I add it?


Solution

  • Bonjour 👋🏼

    You can't add it to the property after the fact; you need to define the mapping when you create the index. This can be done in code, through the REST APIs, or directly from Kibana.

    A direct code example would look like this:

    client.indices().create(cir -> cir.index("create-mapping"));
    client.indices().putMapping(pmr -> pmr.index("create-mapping").properties("foo", p -> p.text(tp -> tp)));
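
    The answer mentions that the mapping can also be defined through the APIs. As a minimal sketch of that route, the following builds the body of a create-index request whose selected text fields carry "analyzer": "french", using only the JDK. The class name FrenchMappingRequest, the index name catalogue and the URL http://localhost:9200 are assumptions made for illustration; the request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the Elasticsearch client.
class FrenchMappingRequest {

    // JSON for one text field, optionally carrying "analyzer": "french".
    // Assumes plain field names that need no JSON escaping.
    static String textField(String name, boolean french) {
        String analyzer = french ? ",\"analyzer\":\"french\"" : "";
        return "\"" + name + "\":{\"type\":\"text\","
            + "\"fields\":{\"keyword\":{\"type\":\"keyword\",\"ignore_above\":256}}"
            + analyzer + "}";
    }

    // Body of the create-index request: explicit mappings for the given text fields.
    static String buildBody(List<String> textFields, Set<String> frenchFields) {
        String properties = textFields.stream()
            .map(f -> textField(f, frenchFields.contains(f)))
            .collect(Collectors.joining(","));
        return "{\"mappings\":{\"properties\":{" + properties + "}}}";
    }

    // PUT /<index> creates the index with these mappings (built here, not sent).
    static HttpRequest createIndexRequest(String baseUrl, String index, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        String body = buildBody(
            List.of("titre", "description", "organisation", "acronyme"),
            Set.of("titre", "description", "organisation"));
        // Assumed local cluster URL and index name.
        HttpRequest request = createIndexRequest("http://localhost:9200", "catalogue", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

    Sending the request with java.net.http.HttpClient before any document is inserted gives the fields their analyzer from the start, so the index does not have to be deleted and recreated.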
    

    Note that some frameworks, like Hibernate Search or Spring Data, provide annotations and might create the mapping for you automatically.

    But the best way to handle this is to create an index template. It's much more flexible.
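
    As a hedged sketch of the index template idea, the following builds a composable index template whose mappings declare "analyzer": "french" for a few fields, so that any index whose name matches the pattern gets them at creation time. It uses only the JDK and the _index_template REST endpoint; the class name FrenchIndexTemplate, the template name, the pattern catalogue-* and the URL are assumptions made for illustration, and the request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the Elasticsearch client.
class FrenchIndexTemplate {

    // Template body: indices matching the pattern get these mappings when created.
    // Assumes plain field names and patterns that need no JSON escaping.
    static String buildBody(String indexPattern, Set<String> frenchFields) {
        String properties = new TreeSet<>(frenchFields).stream() // sorted for stable output
            .map(f -> "\"" + f + "\":{\"type\":\"text\",\"analyzer\":\"french\"}")
            .collect(Collectors.joining(","));
        return "{\"index_patterns\":[\"" + indexPattern + "\"],"
            + "\"template\":{\"mappings\":{\"properties\":{" + properties + "}}}}";
    }

    // PUT /_index_template/<name> registers the template (built here, not sent).
    static HttpRequest putTemplateRequest(String baseUrl, String templateName, String body) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_index_template/" + templateName))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        String body = buildBody("catalogue-*", Set.of("titre", "description", "organisation"));
        // Assumed local cluster URL and template name.
        HttpRequest request = putTemplateRequest("http://localhost:9200", "catalogue-french", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

    Fields not listed in the template are still mapped dynamically at insertion time; the listed fields get exactly what the template declares, so the keyword sub-field has to be added there if it is needed.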


    Marc Le Bihan edit:
    Thanks to your help, I succeeded in adapting my function:

    @Test @DisplayName("Inserting catalogue datasets")
    void insertionCatalogue() {
       // Create the initial index, without any associated mappings
       this.catalogDatagouvElastic.deleteIndex();
       assertTrue(this.catalogDatagouvElastic.createIndex(null), "The target index was not created");
    
       // At least one document must be inserted before the mappings exist.
       List<JeuDeDonnees> jeux = this.jeuxDeDonneesDataset.catalogueDataset().limit(100).collectAsList();
       JeuxDeDonnees jeuxDeDonnees = new JeuxDeDonnees(jeux);
       assertDoesNotThrow(() -> this.catalogDatagouvElastic.insert(jeuxDeDonnees), "Inserting catalogue datasets into Elastic failed");
    
       // Read the index back, then modify it by adding "analyzer": "french" to some text fields.
       TypeMapping mappings = this.catalogDatagouvElastic.getIndex().mappings();
       assertNotNull(mappings);
    
       Map<String, Property> properties = mappings.properties();
       TypeMapping.Builder typeMappingBuilder = new TypeMapping.Builder();
    
       Set<String> champsEnFrancais = Set.of("titre", "description", "organisation");
    
       for(Map.Entry<String, Property> champ : properties.entrySet()) {
          String nomChamp = champ.getKey();
          Property property = champ.getValue();
    
          typeMappingBuilder.properties(nomChamp,
             champsEnFrancais.contains(nomChamp) ? addAnalyzer(property) : property);
       }
    
       TypeMapping nouveauxMappings = typeMappingBuilder.build();
    
       // Recreate the modified index.
       LOGGER.info("New mappings: {}", nouveauxMappings.toString());
       this.catalogDatagouvElastic.deleteIndex();
       assertTrue(this.catalogDatagouvElastic.createIndex(nouveauxMappings), "The modified index with the analyzer was not created");
    
       // And reinsert the data
       assertDoesNotThrow(() -> this.catalogDatagouvElastic.insert(jeuxDeDonnees), "Reinserting catalogue datasets into Elastic failed");
    }
    
    /**
     * Add an "analyzer": "french" entry to the mapping of a text field.
     * @param candidate Existing text property, whose sub-fields are kept.
     * @return Modified property, which receives the mapping.
     */
    private Property addAnalyzer(Property candidate) {
       Property.Builder propertyBuilder = new Property.Builder();
    
       TextProperty.Builder textPropertyBuilder = new TextProperty.Builder().analyzer("french");
       textPropertyBuilder.fields(candidate.text().fields());
    
       TextProperty textProperty = textPropertyBuilder.build();
       propertyBuilder.text(textProperty);
    
       return propertyBuilder.build();
    }
    

    It adds the "analyzer": "french" entry to the selected text fields, as shown here for description:

        "mappings": {
          "properties": {
            "acronyme": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "catalogueId": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "dateCreation": {
              "type": "long"
            },
            "dateDebut": {
              "type": "long"
            },
            "dateFin": {
              "type": "long"
            },
            "dateModification": {
              "type": "long"
            },
            "description": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              },
              "analyzer": "french"
            },