Tags: php, validation, security, elasticsearch, sanitization

How to sanitize an Elasticsearch autogenerated ID?


I would like to validate my Elasticsearch ID before calling the server. I have been searching for a way to check whether the requested ID has the correct format, and I am not sure whether this is necessary. I know that Elasticsearch generates URL-safe Base64 IDs.

Does anyone have a suggestion? My questions are:

  • Should I validate the Elasticsearch ID? If yes, how do I validate the format before executing the query?

  • If no, is it secure to query the Elasticsearch server directly? Users will not be able to input the ID, but a malicious user could intercept the request or figure out how to call the endpoint with a random string. Could that be a possible attack, or give access to a different set of data?

I am an Elasticsearch beginner, but I am looking for best practices. I read that previous versions had a vulnerability that allowed remote code execution:

https://www.elastic.co/blog/scripting-security

http://bouk.co/blog/elasticsearch-rce/

I know this is fixed, but I would still like to validate the ID. At the very least, this avoids an unnecessary call to Elasticsearch when the ID is not in the correct format. The rule I know is "never trust user input", or in my case, never trust input that could have been tampered with.

Note: I am using the elasticsearch-php client.

Any suggestions? Thanks.


Solution

  • Should I validate the Elasticsearch ID?

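    If you want. Since the IDs are URL-safe Base64, you could check that only alphanumerics, - and _ are present (the URL-safe alphabet uses these in place of the + and / of standard Base64, and padding with = is normally omitted). This adds an extra layer of security, but it should not be strictly necessary. In the spirit of "defence in depth", I would recommend it.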

  • If no, is it secure to query the Elasticsearch server directly? Users will not be able to input the ID, but some malicious users can intercept or figure out how to call the endpoint with a random string. Could that be a possible attack or access to a different set of data?

    If you use a tried and tested JSON encoder to build your query, an attacker will not be able to break out of the query and retrieve data they are not meant to see. Breaking out of a JSON string built by concatenation is a form of NoSQL injection, and a proper encoder prevents it. Do this even if you are also validating the ID as described above.
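
The format check suggested in the first answer could look roughly like the sketch below. It is written in Python purely for illustration (the question uses elasticsearch-php, where `preg_match` would play the same role). The function name `looks_like_auto_id` is hypothetical, and the fixed length of 20 characters is an assumption about how auto-generated IDs look in your Elasticsearch version, so verify it against your own data before relying on it.

```python
import re

# Assumption: auto-generated IDs are exactly 20 characters drawn from the
# URL-safe Base64 alphabet (A-Z, a-z, 0-9, '-', '_') with no '=' padding.
# Adjust the length if your Elasticsearch version produces something else.
AUTO_ID_PATTERN = re.compile(r"[A-Za-z0-9_-]{20}")


def looks_like_auto_id(candidate: object) -> bool:
    """Return True if candidate has the shape of an auto-generated ID."""
    if not isinstance(candidate, str):
        return False
    # fullmatch requires the whole string to match, so a trailing newline
    # or extra path segment is rejected.
    return AUTO_ID_PATTERN.fullmatch(candidate) is not None
```

A request whose ID fails this check can be rejected before any call to Elasticsearch is made. Passing the check only means the string is well-formed; it says nothing about whether the caller is allowed to read that document, so authorization still has to be enforced separately.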
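
To illustrate the point about JSON encoders, here is a small Python sketch (again only illustrative; PHP's `json_encode` serves the same purpose) that builds an `ids` query two ways. The function names and the hostile string are made up for the example. The concatenated version lets a crafted ID inject an extra `size` key, while the encoder-based version keeps the whole input inside a single string value.

```python
import json


def build_ids_query(doc_id: str) -> str:
    """Serialize the query body with a JSON encoder, which escapes quotes."""
    body = {"query": {"ids": {"values": [doc_id]}}}
    return json.dumps(body)


def build_ids_query_unsafe(doc_id: str) -> str:
    """Anti-pattern, shown only for contrast: plain string concatenation."""
    return '{"query": {"ids": {"values": ["' + doc_id + '"]}}}'


# Hypothetical hostile input that closes the string and adds its own keys.
HOSTILE = 'x"]}}, "size": 10000, "x": {"y": {"z": ["'

safe = json.loads(build_ids_query(HOSTILE))
unsafe = json.loads(build_ids_query_unsafe(HOSTILE))

# The encoder keeps the hostile text as one literal ID value.
print(safe["query"]["ids"]["values"] == [HOSTILE])
# The concatenated body now carries an attacker-chosen top-level key.
print("size" in unsafe)
```

If the client library accepts a native data structure for the request body and serializes it itself, passing the structure directly achieves the same effect as calling the encoder by hand.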