elasticsearch, logstash, logstash-configuration

Logstash twitter input to Elasticsearch output: how many indexes to have


Given that Logstash configs can have multiple inputs and outputs, what considerations drive the decision about how many Elasticsearch indexes to write to when I'm using the twitter input in Logstash?

Should I have one index per monitored account, one per tag or keyword, or are there other considerations that would affect the design?


Solution

  • There is overhead in Elasticsearch for each open index, so every additional index consumes heap.

    It's common to put more than one type of document in an index (that's what the [type] field is for). Note that, in Elasticsearch v2, identically named fields must have the same mapping across all types in an index (if "myField" is a string in one type, it must be a string in every other type).

    Shards have a recommended upper limit on size; IIRC the figure usually cited is around 50GB per shard.

    Finally, arrange your indexes so that enforcing your retention policy is easy. If everything is kept for 7 days, then a daily index works well, because expiring data means deleting whole indexes. Use Curator to delete old indexes.

    I prefer a smaller number of large indexes.
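
As a rough illustration of the points above, here is a minimal sketch of a Logstash config that sends all tweets to a single daily index and uses the type field to label the documents, rather than creating one index per account or keyword. The credentials, the keywords, the "tweet" type, the "twitter-" index prefix and the localhost address are all placeholders I've assumed for the example. The `hosts` option is the name used by the elasticsearch output in Logstash 2.x and later; older releases used a different option name, so check the plugin docs for your version.

```
input {
  twitter {
    # Placeholder credentials: replace with your own application's values.
    consumer_key       => "CONSUMER_KEY"
    consumer_secret    => "CONSUMER_SECRET"
    oauth_token        => "OAUTH_TOKEN"
    oauth_token_secret => "OAUTH_TOKEN_SECRET"

    # Hypothetical keywords to track.
    keywords => ["elasticsearch", "logstash"]

    # Label the events so other document types can share the same index.
    type => "tweet"
  }
}

output {
  elasticsearch {
    # Assumed single local node.
    hosts => ["localhost:9200"]

    # One index per day, named from the event's @timestamp,
    # e.g. twitter-2016.01.31.
    index => "twitter-%{+YYYY.MM.dd}"
  }
}
```

With this naming, a 7-day retention policy comes down to deleting any twitter-* index whose date suffix is more than 7 days old, which is the kind of job Curator is meant for. If daily indexes turn out to be very small, a weekly or monthly date pattern would keep the number of open indexes lower.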