I am new to Elasticsearch. I am trying to do a bulk insert from Python into an Elasticsearch index that uses an NLP model, via an ingest pipeline, to convert text into embeddings. But not all of the documents are getting inserted: only about 2,000 out of 40k documents make it in.
Elasticsearch version: 8.3
This is the exception I get when calling the bulk insert command:
{'index': {'_index': 'index_name', '_id': '40962', 'status': 500, 'error': {'type': 'exception', 'reason': 'org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: inference process queue is full. Unable to execute command', 'caused_by': {'type': 'es_rejected_execution_exception', 'reason': 'inference process queue is full. Unable to execute command'}}}},
Please help.
This is due to inference requests queueing up and being rejected. It can happen when many items are being ingested through a model that takes a while to infer.
The solution here is to increase the queue_capacity (a query parameter in the start trained model deployment API). Some relevant documentation: https://www.elastic.co/guide/en/elasticsearch/reference/8.3/start-trained-model-deployment.html
An example of starting a deployment with a specific queue capacity:
POST _ml/trained_models/elastic__distilbert-base-uncased-finetuned-conll03-english/deployment/_start?wait_for=started&queue_capacity=2000
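As a complementary mitigation (my own suggestion, not from the documentation above), you can also inspect the bulk response on the client side, collect the ids that were rejected because the inference queue was full, and re-send just those documents after a short backoff. A minimal sketch, assuming the bulk response has the standard shape shown in the error above (the function name is hypothetical):

```python
def rejected_ids(bulk_response):
    """Return the _ids of bulk items rejected with es_rejected_execution_exception.

    Works on the parsed JSON body of a bulk response, whose "items" list
    contains one entry per action, keyed by the action name (e.g. "index").
    """
    rejected = []
    for item in bulk_response.get("items", []):
        for _action, result in item.items():
            error = result.get("error") or {}
            caused_by = error.get("caused_by") or {}
            # The rejection can appear as the top-level error type or nested
            # under caused_by, as in the error message quoted in the question.
            if (error.get("type") == "es_rejected_execution_exception"
                    or caused_by.get("type") == "es_rejected_execution_exception"):
                rejected.append(result["_id"])
    return rejected
```

You would call this on each bulk response, wait a bit (e.g. `time.sleep`), and re-submit only the rejected documents; combined with a larger queue_capacity and smaller bulk chunks, this usually gets all 40k documents through.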