I have an AWS Elasticsearch server, and I use a mapping template together with an index strategy.
"index_patterns": "users*",
"order": 6,
"version": 6,
"aliases": {
"users": {}
"settings": {
"number_of_shards": 5
"mappings": {
"_doc": {
"dynamic": "strict",
"properties": {
"id": { "type": "keyword" },
"emailAddress": { "type": "keyword" }
Index strategy is {index_patterns}-{yyyy}-{MM}-{order}-{version}
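As a minimal sketch of how a concrete index name could be derived from this strategy, the helper below formats the pieces in order. The class name `IndexNaming`, the method `Build`, and the use of `users` as the prefix (the pattern without its trailing `*`) are assumptions for illustration, not code from the project.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: names are illustrative only.
public static class IndexNaming
{
    // Builds "{prefix}-{yyyy}-{MM}-{order}-{version}".
    public static string Build(string prefix, DateTime utcDate, int order, int version)
    {
        if (string.IsNullOrWhiteSpace(prefix))
            throw new ArgumentException("Prefix must not be empty.", nameof(prefix));

        // Invariant culture keeps the digits stable regardless of server locale.
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0}-{1:yyyy}-{1:MM}-{2}-{3}",
            prefix, utcDate, order, version);
    }
}
```

For example, `IndexNaming.Build("users", new DateTime(2019, 5, 17), 6, 6)` yields `users-2019-05-6-6`. Because the month is part of the name, the index holding a given user depends on when the document was first written, which is why the handler below has to look it up.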
public async Task<Result> HandleEventAsync(UserChanged @event, CancellationToken cancellationToken)
{
    // 1. Get the user. I could skip this call if the index were known and no strategy were used.
    var userMaybe =
        await _usersRepository.GetByIdAsync(@event.AggregateId.ToString(), cancellationToken);
    if (userMaybe.HasValue)
    {
        var user = userMaybe.Value.User;
        var partialUpdate = new
        {
            name = @event.Profile.Name,
            birthDate = @event.Profile.BirthDate?.ToString("yyyy-MM-dd"),
            gender = @event.Profile.Gender.ToString(),
            updatedDate = DateTime.UtcNow,
            updatedTimestampEpochInMilliseconds = EpochGenerator.EpochTimestampInMilliseconds(),
        };

        // 2. Remove fields with NULL values (if any are found)
        // 3. Partial or full update of the document, in this case partial
        var result = await _usersRepository.UpdateAsync(user.Id, partialUpdate, userMaybe.Value.Index, cancellationToken: cancellationToken);
        return result.IsSuccess ? Result.Ok() : Result.Fail($"Failed to update User {user.Id}");
    }

    return Result.Fail("User doesn't exist");
}
So in this method I consume an SQS message, retrieve the document from Elasticsearch only to find out which index it lives in (I don't know it explicitly), remove any NULL fields using the methods below (the serializer used by the update would otherwise include NULL values), and then partially update the document.
That is 3 Elasticsearch operations for 1 update. I understand that the UpdateByQuery call for NULL values could be dropped by deciding to tolerate null values in the document, but then we might not be able to query these fields with Exists/NotExists if that is ever needed.
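One way the null-removal and the partial update could be collapsed into a single scripted request is to generate one Painless script that assigns the non-null fields from `params` and removes the null ones. The sketch below only builds the script source and its parameters; the class `PartialUpdateScript` and its `Build` method are hypothetical, and it assumes top-level field names that come from trusted code (they are not escaped).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: not part of NEST or of the original project.
public static class PartialUpdateScript
{
    // Returns the Painless source plus the params to send with it.
    // Null values become ctx._source.remove(...); other values are assigned from params.
    public static (string Source, Dictionary<string, object> Params) Build(
        IReadOnlyDictionary<string, object> fields)
    {
        var statements = new List<string>();
        var parameters = new Dictionary<string, object>();

        // Ordinal ordering keeps the generated script deterministic.
        foreach (var field in fields.OrderBy(f => f.Key, StringComparer.Ordinal))
        {
            if (field.Value == null)
            {
                statements.Add($"ctx._source.remove('{field.Key}')");
            }
            else
            {
                statements.Add($"ctx._source['{field.Key}'] = params['{field.Key}']");
                parameters[field.Key] = field.Value;
            }
        }

        return (string.Join("; ", statements), parameters);
    }
}
```

With `name = "Ann"` and `gender = null`, the generated source is `ctx._source.remove('gender'); ctx._source['name'] = params['name']`, and only `name` is sent as a parameter. Whether this is preferable to two separate calls depends on how much scripting the cluster is expected to handle.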
private async Task<Result> RemoveNullFieldsFromDocumentAsync(
    object document,
    string documentId,
    string indexName = null,
    string typeName = null,
    CancellationToken cancellationToken = default)
{
    var result = Result.Ok();
    var allNullProperties = GetNullPropertyValueNames(document);
    if (allNullProperties.AnyAndNotNull())
    {
        // One Painless statement per null field, joined with "; " (avoids the doubled ";;" that Aggregate produced).
        var script = string.Join("; ", allNullProperties.Select(p => $"ctx._source.remove('{p}')"));
        result = await UpdateByQueryIdAsync(
            documentId,
            script,
            indexName,
            typeName,
            cancellationToken: cancellationToken);
    }

    return result;
}
private static IReadOnlyList<string> GetNullPropertyValueNames(object document)
{
    var allPublicProperties = document.GetType().GetProperties().ToList();
    var allObjects = allPublicProperties.Where(pi => pi.PropertyType.IsClass).ToList();
    var allNames = new List<string>();
    foreach (var propertyInfo in allObjects)
    {
        if (propertyInfo.PropertyType == typeof(string))
        {
            var isNullOrEmpty = ((string) propertyInfo.GetValue(document)).IsNullOrEmpty();
            if (isNullOrEmpty)
                allNames.Add(propertyInfo.Name);
        }
        else if (propertyInfo.PropertyType.IsClass)
        {
            if (propertyInfo.GetValue(document).IsNull())
                allNames.Add(propertyInfo.Name);
            else
            {
                allNames.AddRange(GetNullPropertyValueNames(propertyInfo.GetValue(document))
                    .Select(p => $"{propertyInfo.PropertyType.Name.ToCamelCase()}.{p.ToCamelCase()}"));
            }
        }
    }
    return allNames;
}
public async Task<Result> UpdateByQueryIdAsync(
    string documentId,
    string script,
    string indexName = null,
    string typeName = null,
    bool waitForCompletion = false,
    CancellationToken cancellationToken = default)
{
    Guard.Argument(documentId, nameof(documentId)).NotNull().NotEmpty().NotWhiteSpace();
    Guard.Argument(script, nameof(script)).NotNull().NotEmpty().NotWhiteSpace();

    var response = await Client.UpdateByQueryAsync<T>(
        u => u.Query(q => q.Ids(i => i.Values(documentId)))
            .Script(s => s.Source(script))
            .Index(indexName ?? DocumentMappings.IndexStrategy)
            .Type(typeName ?? DocumentMappings.TypeName),
        cancellationToken);

    var errorMessage = response.LogResponseIfError(_logger);
    return errorMessage.IsNullOrEmpty() ? Result.Ok() : Result.Fail(errorMessage);
}
My question is: if I change the strategy to use one constant index for all user documents, which are not significant in number and will not grow into the billions any time soon, will I take a performance hit in Elasticsearch (sharding, indexing, etc.)?
Yes, you can use a single index. A single index can handle a lot of data, so you don't need to split your documents into indices as small as the ones you have now. In fact, small indices with small shards are worse from a performance perspective, because they lead to many shards per node, and each shard eats up heap space with its overhead.
Creating a single date-based index makes sense if you have a lot of data coming in regularly, so maybe just the index_name-yyyyMMdd pattern would work.
Last, you can always search across all your indices using wildcards. So you could search the above by querying index_name-*. In your existing pattern, you could do the same: index_patterns-* or index_patterns-yyyy-*, etc.
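As a rough sketch of the wildcard idea applied to the strategy in the question, the helper below builds patterns that cover all indices, one year, or one month. The class `IndexWildcards` and its methods are hypothetical, and `users` as the prefix is an assumption.

```csharp
using System;

// Hypothetical helper: builds wildcard index patterns for
// names shaped like "{prefix}-{yyyy}-{MM}-{order}-{version}".
public static class IndexWildcards
{
    // Every index created by the strategy, e.g. "users-*".
    public static string All(string prefix) => $"{prefix}-*";

    // Every index for one year, e.g. "users-2019-*".
    public static string ForYear(string prefix, int year) => $"{prefix}-{year:D4}-*";

    // Every index for one month, e.g. "users-2019-05-*".
    public static string ForMonth(string prefix, int year, int month) =>
        $"{prefix}-{year:D4}-{month:D2}-*";
}
```

Passing such a pattern (or the `users` alias defined in the template) as the index of an ids query means a document can be targeted without first fetching it to learn its concrete index name.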
Some info around shard sizing: https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster