Tags: elasticsearch, jupyter-notebook

What is the reason for "elasticsearch.BadRequestError: BadRequestError(400, 'x_content_parse_exception', '[1:137] unknown field [xlm_roberta]')"?


  1. What do I want?

    I want to install an additional NLP model to Elasticsearch. The model is called multilingual-e5-base.

  2. What did I do?

    I followed the steps in the Elasticsearch documentation.

  3. What did I expect?

    The multilingual-e5-base model to be installed successfully.

  4. What did I try?

Here is the relevant section in my Python Notebook for the installation:

!source ../.env && \
eland_import_hub_model \
    --url ${REMOTE_HOST} \
    --es-username ${USER} \
    --es-password ${API_KEY} \
    --hub-model-id intfloat/multilingual-e5-base \
    --es-model-id multilingual-e5-base \
    --task-type text_embedding \
    --start

Error message:

raise HTTP_EXCEPTIONS.get(meta.status, ApiError)(
elasticsearch.BadRequestError: BadRequestError(400, 'x_content_parse_exception', '[1:137] unknown field [xlm_roberta]')

Solution

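  • Models initialized from xlm-roberta-base, such as intfloat/multilingual-e5-base, are only supported from Elasticsearch 8.9 onward. The "unknown field [xlm_roberta]" error indicates that your cluster is older and does not recognize the xlm_roberta field in the model configuration that eland sends.

    To import this model, upgrade to Elasticsearch 8.9 or later.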
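To check whether a cluster meets the 8.9 requirement, something like the following could be used. This is a minimal sketch: `supports_xlm_roberta` is a hypothetical helper name, the 8.9 threshold is taken from the answer itself, and the version string is assumed to come from the `version.number` field that the cluster's root endpoint reports.

```python
import re

# Minimum version that accepts the xlm_roberta field, per the answer.
MIN_VERSION = (8, 9)


def supports_xlm_roberta(version_number: str) -> bool:
    """Return True if the Elasticsearch version string is at least 8.9.

    Hypothetical helper: only the leading major.minor digits are compared,
    so a suffix such as "-SNAPSHOT" is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version_number)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_number!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuple comparison handles two-digit minors correctly (8.10 > 8.9).
    return (major, minor) >= MIN_VERSION


# Example against a live cluster (assumes the elasticsearch Python client is
# installed and that REMOTE_HOST and API_KEY are defined in the notebook):
#
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch(REMOTE_HOST, api_key=API_KEY)
#   print(supports_xlm_roberta(es.info()["version"]["number"]))
```

Comparing parsed integers rather than raw strings matters here, because a plain string comparison would rank "8.10" below "8.9".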

    Also, the variable name ${API_KEY} suggests that you are passing an API key rather than a password. If it really is an API key, authenticate with the --es-api-key option instead of --es-username and --es-password.
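The difference between the two authentication styles can be sketched as follows. This is an illustration only: `build_eland_command` is a hypothetical helper, not part of eland, and it merely assembles the argument list for the `eland_import_hub_model` call from the question, using only the flags that already appear in this thread.

```python
from typing import List, Optional


def build_eland_command(
    url: str,
    api_key: Optional[str] = None,
    username: Optional[str] = None,
    password: Optional[str] = None,
) -> List[str]:
    """Assemble the eland_import_hub_model arguments from the question.

    Hypothetical helper: it only builds the argument list and does not
    contact Elasticsearch or validate the credentials.
    """
    cmd = [
        "eland_import_hub_model",
        "--url", url,
        "--hub-model-id", "intfloat/multilingual-e5-base",
        "--es-model-id", "multilingual-e5-base",
        "--task-type", "text_embedding",
        "--start",
    ]
    if api_key is not None:
        # An API key goes in its own option, not in --es-password.
        cmd += ["--es-api-key", api_key]
    elif username is not None and password is not None:
        cmd += ["--es-username", username, "--es-password", password]
    else:
        raise ValueError("provide either api_key or username and password")
    return cmd
```

In a notebook, the resulting list could be passed to `subprocess.run(cmd, check=True)` in place of the `!` shell cell.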