I have an Elasticsearch cluster in AWS and was alerted that the cluster's minimum free storage space was around 2 GB. Before simply upgrading the storage on each node, I decided to dig a little deeper. For reference, the cluster has 8 nodes with 35 GB of storage each. I am struggling to understand why the FreeStorageSpace metric for each node (and the minimum FreeStorageSpace metric for the cluster) does not align with what the Elasticsearch APIs report.
Viewing Free storage space per node on the ES instance health tab:
When I query the _cat/allocation API:
Ultimately I am trying to decide whether the available storage on the nodes reporting the least free space is 2 GB, as per the CloudWatch metrics, or 8.8 GB, as per the _cat/allocation API. This will help me decide how to scale. I understand that Amazon ES reserves a percentage of the storage space on each instance for internal operations, but I would assume that reserve would also reduce the disk.avail value in the image above. Any insights into why these aren't lining up would be fantastic.
This is because AWS Elasticsearch, being a managed service, has its own storage overhead.
From AWS Documentation:
Operating system reserved space: By default, Linux reserves 5% of the file system for the root user for critical processes, system recovery, and to safeguard against disk fragmentation problems.
Amazon ES overhead: Amazon ES reserves 20% of the storage space of each instance (up to 20 GiB) for segment merges, logs, and other internal operations.
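Putting the two documented reserves together, you can roughly estimate the per-node usable storage. This is a sketch based only on the percentages quoted above (it assumes both reserves come out of the raw instance storage; it is not an official AWS formula):

```python
def usable_storage_gib(instance_storage_gib):
    """Rough estimate of per-node usable storage after the documented reserves.

    Assumes: 5% Linux root reserve, plus an Amazon ES reserve of
    20% of instance storage capped at 20 GiB.
    """
    os_reserve = 0.05 * instance_storage_gib
    es_reserve = min(0.20 * instance_storage_gib, 20)
    return instance_storage_gib - os_reserve - es_reserve

# For the 35 GiB nodes in this cluster:
# 35 - 1.75 (OS reserve) - 7 (ES reserve) = 26.25 GiB usable
print(usable_storage_gib(35))
```

For a large instance (say 200 GiB), the ES reserve hits the 20 GiB cap, so the overhead fraction shrinks as instances grow.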
There are two ways to view your free storage:
FreeStorageSpace CloudWatch metric - This subtracts the overhead and shows the space actually available to you.
From AWS Documentation for FreeStorageSpace:
FreeStorageSpace will always be lower than the value that the Elasticsearch _cluster/stats API provides. Amazon ES reserves a percentage of the storage space on each instance for internal operations.
Elasticsearch APIs - Since these are native Elasticsearch APIs, they display the raw free space, which will be higher than the space you can actually use.
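This is also consistent with the specific numbers in the question. A back-of-the-envelope reconciliation, assuming FreeStorageSpace is roughly the raw free space minus the Amazon ES reserve (my reading of the documentation quote, not an official formula):

```python
# Per-node values taken from the question (GiB) -- for illustration only.
raw_free = 8.8                    # disk.avail reported by _cat/allocation
instance_storage = 35             # per-node instance storage

# Amazon ES reserve: 20% of instance storage, capped at 20 GiB -> 7 GiB here.
es_reserve = min(0.20 * instance_storage, 20)

# Estimated space CloudWatch would report as actually available.
estimated_free = raw_free - es_reserve
print(estimated_free)  # roughly 1.8, close to the ~2 GB CloudWatch alert
```

The estimate landing near the 2 GB CloudWatch figure suggests the gap between the two numbers is almost entirely the Amazon ES reserve, so for scaling decisions the CloudWatch value is the one to trust.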