Use case: I want to develop a microservice with Spring Boot and Elasticsearch that follows the search-as-you-type pattern. In other words, if I type "d" I want to get back Demetrio, Denis and Daniel. Typing a second letter, "e", should bring back Demetrio and Denis, and finally the third letter should retrieve the exact name. Even typing letters from the middle of a name should match: "en" should bring back Denis and Daniel. A pretty common case of search as you type.
I am studying the recommendations found in:
Current issue: when I boot my application, which is meant to create and set up the Elasticsearch index, I get the exception from this question's title. The index is created successfully and my initial data is loaded, but it seems the analyzer is totally ignored.
Full logs while booting the Spring Boot application:
2020-04-10 14:27:40.281 INFO 16556 --- [ main] com.poc.search.SearchApplication : Starting SearchApplication on SPANOT164 with PID 16556 (C:\WSs\elasticsearch\search\target\classes started by Cast in C:\WSs\elasticsearch\search)
2020-04-10 14:27:40.286 INFO 16556 --- [ main] com.poc.search.SearchApplication : No active profile set, falling back to default profiles: default
2020-04-10 14:27:40.863 INFO 16556 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data Elasticsearch repositories in DEFAULT mode.
2020-04-10 14:27:40.931 INFO 16556 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 62ms. Found 1 Elasticsearch repository interfaces.
2020-04-10 14:27:41.101 INFO 16556 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Bootstrapping Spring Data Reactive Elasticsearch repositories in DEFAULT mode.
2020-04-10 14:27:41.120 INFO 16556 --- [ main] .s.d.r.c.RepositoryConfigurationDelegate : Finished Spring Data repository scanning in 13ms. Found 0 Reactive Elasticsearch repository interfaces.
2020-04-10 14:27:42.343 INFO 16556 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat initialized with port(s): 8080 (http)
2020-04-10 14:27:42.360 INFO 16556 --- [ main] o.apache.catalina.core.StandardService : Starting service [Tomcat]
2020-04-10 14:27:42.360 INFO 16556 --- [ main] org.apache.catalina.core.StandardEngine : Starting Servlet engine: [Apache Tomcat/9.0.33]
2020-04-10 14:27:42.496 INFO 16556 --- [ main] o.a.c.c.C.[Tomcat].[localhost].[/] : Initializing Spring embedded WebApplicationContext
2020-04-10 14:27:42.496 INFO 16556 --- [ main] o.s.web.context.ContextLoader : Root WebApplicationContext: initialization completed in 2122 ms
2020-04-10 14:27:43.221 INFO 16556 --- [ main] o.elasticsearch.plugins.PluginsService : no modules loaded
2020-04-10 14:27:43.222 INFO 16556 --- [ main] o.elasticsearch.plugins.PluginsService : loaded plugin [org.elasticsearch.index.reindex.ReindexPlugin]
2020-04-10 14:27:43.222 INFO 16556 --- [ main] o.elasticsearch.plugins.PluginsService : loaded plugin [org.elasticsearch.join.ParentJoinPlugin]
2020-04-10 14:27:43.222 INFO 16556 --- [ main] o.elasticsearch.plugins.PluginsService : loaded plugin [org.elasticsearch.percolator.PercolatorPlugin]
2020-04-10 14:27:43.222 INFO 16556 --- [ main] o.elasticsearch.plugins.PluginsService : loaded plugin [org.elasticsearch.script.mustache.MustachePlugin]
2020-04-10 14:27:43.222 INFO 16556 --- [ main] o.elasticsearch.plugins.PluginsService : loaded plugin [org.elasticsearch.transport.Netty4Plugin]
2020-04-10 14:27:45.480 INFO 16556 --- [ main] o.s.d.e.c.TransportClientFactoryBean : Adding transport node : 127.0.0.1:9300
2020-04-10 14:27:47.539 ERROR 16556 --- [ main] .d.e.r.s.AbstractElasticsearchRepository : failed to load elasticsearch nodes : org.elasticsearch.index.mapper.MapperParsingException: analyzer [autocomplete_index] not found for field [palavra]
2020-04-10 14:27:47.775 INFO 16556 --- [ main] o.s.s.concurrent.ThreadPoolTaskExecutor : Initializing ExecutorService 'applicationTaskExecutor'
2020-04-10 14:27:48.333 INFO 16556 --- [ main] o.s.b.w.embedded.tomcat.TomcatWebServer : Tomcat started on port(s): 8080 (http) with context path ''
2020-04-10 14:27:48.334 INFO 16556 --- [ main] com.poc.search.SearchApplication : Started SearchApplication in 8.714 seconds (JVM running for 9.159)
elastic-analyzer.json from resources/data/es-config
{
  "analysis": {
    "filter": {
      "autocomplete_filter": {
        "type": "edge_ngram",
        "min_gram": 1,
        "max_gram": 20
      }
    },
    "analyzer": {
      "autocomplete_search": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase"
        ]
      },
      "autocomplete_index": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "lowercase",
          "autocomplete_filter"
        ]
      }
    }
  }
}
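To make the intended effect of autocomplete_index concrete, here is a minimal plain-Java sketch that mimics what a lowercase filter followed by an edge_ngram filter (min_gram 1, max_gram 20) produces for a single word. It is not Elasticsearch code, and the class and method names (EdgeNgramSketch, edgeNgrams, matches) are made up for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class EdgeNgramSketch {

    // Lowercases the word and returns its prefixes of length minGram..maxGram,
    // which is what an edge n-gram filter emits for a single token.
    static List<String> edgeNgrams(String word, int minGram, int maxGram) {
        String token = word.toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, token.length());
        for (int len = minGram; len <= longest; len++) {
            grams.add(token.substring(0, len));
        }
        return grams;
    }

    // The search side only lowercases the input, so a match means the typed
    // text is equal to one of the indexed prefixes.
    static boolean matches(String indexedWord, String typed) {
        return edgeNgrams(indexedWord, 1, 20).contains(typed.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(edgeNgrams("Denis", 1, 20)); // [d, de, den, deni, denis]
        System.out.println(matches("Denis", "de"));     // true
        System.out.println(matches("Denis", "en"));     // false
    }
}
```

Edge n-grams are anchored at the start of each token, so this configuration covers the prefix cases ("d", "de", "den"). Matching letters from the middle of a word, such as "en", would most likely need an ngram filter instead of edge_ngram.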
ElasticSearchDataLoader
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.type.CollectionType;
import com.fasterxml.jackson.databind.type.TypeFactory;
import com.poc.search.model.Correntista;
import com.poc.search.service.CorrentistaService;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.CommandLineRunner;
import org.springframework.core.io.Resource;
import org.springframework.stereotype.Component;

import java.io.IOException;
import java.util.List;
import java.util.UUID;
import java.util.stream.Collectors;

@Component
public class ElasticSearchDataLoader implements CommandLineRunner {

    @Value("classpath:data/correntistas.json")
    private Resource usersJsonFile;

    @Autowired
    private CorrentistaService correntistaService;

    // Loads the initial data once; skipped when the index already holds documents.
    @Override
    public void run(String... args) throws Exception {
        if (this.isInitialized()) {
            return;
        }
        List<Correntista> users = this.loadUsersFromFile();
        users.forEach(correntistaService::save);
    }

    // Reads the JSON fixture and maps each entry to a Correntista with a random id.
    private List<Correntista> loadUsersFromFile() throws IOException {
        ObjectMapper objectMapper = new ObjectMapper();
        CollectionType collectionType = TypeFactory.defaultInstance().constructCollectionType(List.class, CorrentistaInitData.class);
        List<CorrentistaInitData> allFakeUsers = objectMapper.readValue(this.usersJsonFile.getFile(), collectionType);
        return allFakeUsers.stream().map(this::from).map(this::generateId).collect(Collectors.toList());
    }

    private Correntista generateId(Correntista correntista) {
        correntista.setId(UUID.randomUUID().toString());
        return correntista;
    }

    // Note: the "nome" from the fixture is stored in the analyzed field "palavra".
    private Correntista from(CorrentistaInitData correntistaJson) {
        Correntista correntista = new Correntista();
        correntista.setConta(correntistaJson.getConta());
        correntista.setSobrenome(correntistaJson.getSobrenome());
        correntista.setPalavra(correntistaJson.getNome());
        return correntista;
    }

    private boolean isInitialized() {
        return this.correntistaService.count() > 0;
    }
}
Correntista model
@Document(indexName = "correntistas")
@Setting(settingPath = "es-config/elastic-analyzer.json")
@Getter
@Setter
public class Correntista {

    @Id
    private String id;

    private String conta;

    private String sobrenome;

    @Field(type = FieldType.Text, analyzer = "autocomplete_index", searchAnalyzer = "autocomplete_search")
    private String palavra;
}
application.yml
spring:
  data:
    elasticsearch:
      cluster-name: docker-cluster
      cluster-nodes: localhost:9300
application boot:
@EnableElasticsearchRepositories
@SpringBootApplication
public class SearchApplication {

    public static void main(String[] args) {
        SpringApplication.run(SearchApplication.class, args);
    }
}
Elastic index settings
{
  "correntistas": {
    "settings": {
      "index": {
        "refresh_interval": "1s",
        "number_of_shards": "5",
        "provided_name": "correntistas",
        "creation_date": "1586539666845",
        "store": {
          "type": "fs"
        },
        "number_of_replicas": "1",
        "uuid": "2eEha4aMQm2bdut4pd0aAg",
        "version": {
          "created": "6080499"
        }
      }
    }
  }
}
All data was initially loaded as expected:
{
  "took": 66,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "correntistas",
        "_type": "correntista",
        "_id": "7353cd8c-791d-47f5-90b6-a1b5bcf83853",
        "_score": 1.0,
        "_source": {
          "id": "7353cd8c-791d-47f5-90b6-a1b5bcf83853",
          "conta": "1234",
          "sobrenome": "Carvalho",
          "palavra": "Demetrio"
        }
      },
      {
        "_index": "correntistas",
        "_type": "correntista",
        "_id": "122db1bc-584d-4bef-b5ea-3d9e0d42448e",
        "_score": 1.0,
        "_source": {
          "id": "122db1bc-584d-4bef-b5ea-3d9e0d42448e",
          "conta": "5678",
          "sobrenome": "Carv",
          "palavra": "Deme"
        }
      }
    ]
  }
}
So, my main question is: why isn't the analyzer created even though the index is created successfully? A related question: why does "failed to load elasticsearch nodes" show up when the data was loaded correctly?
In your description of the files you write:
elastic-analyzer.json from resources/data/es-config
but in your @Setting annotation the data part of that path is missing. You should change that to:
@Setting(settingPath = "data/es-config/elastic-analyzer.json")
or move the JSON file one directory up.
Because of this wrong path, the settings weren't written to the index on creation and therefore the analyzer is not available - which then leads to the error message you see.
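A quick way to confirm this kind of mismatch is to check whether a given path resolves on the classpath at all, which is, as far as I know, how the settingPath value is looked up. The following is only a diagnostic sketch in plain Java; the class name SettingPathCheck and the method existsOnClasspath are made up for illustration and are not part of Spring Data.

```java
import java.net.URL;

public class SettingPathCheck {

    // Returns true when the given path can be found on the classpath.
    // Paths are relative to the classpath root, without a leading slash.
    static boolean existsOnClasspath(String path) {
        URL url = SettingPathCheck.class.getClassLoader().getResource(path);
        return url != null;
    }

    public static void main(String[] args) {
        // Path used in the @Setting annotation of the question.
        System.out.println(existsOnClasspath("es-config/elastic-analyzer.json"));
        // Path matching the described location resources/data/es-config.
        System.out.println(existsOnClasspath("data/es-config/elastic-analyzer.json"));
    }
}
```

Run from within the project, the first line would be expected to print false and the second true if the file really lives under resources/data/es-config.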
Another thing: when loading your data, instead of calling save for every entity object, you should collect them in a list and do a batch insert using saveAll, which performs much better.
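The difference between the two approaches can be sketched without Spring at all. CountingRepository below is a hypothetical stand-in that only counts simulated requests. It is not Spring Data's repository, although its saveAll takes an Iterable like the one on CrudRepository. The assumption that a batch goes out as a single request is mine and is marked as such in the code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BatchSaveSketch {

    /** Hypothetical stand-in for a repository; it only counts simulated requests. */
    static class CountingRepository<T> {
        private final List<T> stored = new ArrayList<>();
        private int requests = 0;

        // One simulated request per entity.
        void save(T entity) {
            requests++;
            stored.add(entity);
        }

        // Assumption: the whole batch is sent as one simulated request.
        void saveAll(Iterable<T> entities) {
            requests++;
            entities.forEach(stored::add);
        }

        int requests() {
            return requests;
        }
    }

    static int requestsWhenSavingOneByOne(List<String> names) {
        CountingRepository<String> repo = new CountingRepository<>();
        names.forEach(repo::save);
        return repo.requests();
    }

    static int requestsWhenSavingAsBatch(List<String> names) {
        CountingRepository<String> repo = new CountingRepository<>();
        repo.saveAll(names);
        return repo.requests();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("Demetrio", "Denis", "Daniel");
        System.out.println(requestsWhenSavingOneByOne(names)); // 3
        System.out.println(requestsWhenSavingAsBatch(names));  // 1
    }
}
```

In the loader from the question this would roughly amount to replacing users.forEach(correntistaService::save) with a single call such as correntistaService.saveAll(users), assuming the service exposes the repository's saveAll.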