I have set up server-side GTM following this guide: https://aws-solutions-library-samples.github.io/advertising-marketing/using-google-tag-manager-for-server-side-website-analytics-on-aws.html
I am using AWS ECS task definitions and services. Downstream, I use Snowbridge to send data from AWS Kinesis to GTM (Snowplow client) via HTTP POST requests.
When the data volume is high, I occasionally get a 502 error from GTM. If I filter the events and reduce the amount of data forwarded to GTM, the error disappears. What can I change on the GTM side so that it can handle high data volumes? Can I use automatic scaling in ECS?
I have already set parameters like
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 50
but the problem persists.
This is roughly what my GTM configuration looks like:
resource "aws_ecs_cluster" "gtm" {
name = "gtm"
setting {
name = "containerInsights"
value = "enabled"
}
}
resource "aws_ecs_task_definition" "PrimaryServerSideContainer" {
family = "PrimaryServerSideContainer"
network_mode = "awsvpc"
requires_compatibilities = ["FARGATE"]
cpu = 2048
memory = 4096
execution_role_arn = aws_iam_role.gtm_container_exec_role.arn
task_role_arn = aws_iam_role.gtm_container_role.arn
runtime_platform {
operating_system_family = "LINUX"
cpu_architecture = "X86_64"
}
container_definitions = <<TASK_DEFINITION
[
{
"name": "primary",
"image": "gcr.io/cloud-tagging-10302018/gtm-cloud-image",
"environment": [
{
"name": "PORT",
"value": "80"
},
{
"name": "PREVIEW_SERVER_URL",
"value": "${var.PREVIEW_SERVER_URL}"
},
{
"name": "CONTAINER_CONFIG",
"value": "${var.CONTAINER_CONFIG}"
}
],
"cpu": 1024,
"memory": 2048,
"essential": true,
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "gtm-primary",
"awslogs-create-group": "true",
"awslogs-region": "eu-central-1",
"awslogs-stream-prefix": "ecs"
}
},
"portMappings" : [
{
"containerPort" : 80,
"hostPort" : 80
}
]
}
]
TASK_DEFINITION
}
resource "aws_ecs_service" "PrimaryServerSideService" {
name = var.primary_service_name
cluster = aws_ecs_cluster.gtm.id
task_definition = aws_ecs_task_definition.PrimaryServerSideContainer.id
desired_count = var.primary_service_desired_count
launch_type = "FARGATE"
platform_version = "LATEST"
scheduling_strategy = "REPLICA"
deployment_maximum_percent = 200
deployment_minimum_healthy_percent = 50
network_configuration {
assign_public_ip = true
security_groups = [aws_security_group.gtm-security-group.id]
subnets = data.aws_subnets.private.ids
}
load_balancer {
target_group_arn = aws_lb_target_group.PrimaryServerSideTarget.arn
container_name = "primary"
container_port = 80
}
lifecycle {
ignore_changes = [task_definition]
}
}
resource "aws_lb" "PrimaryServerSideLoadBalancer" {
name = "PrimaryServerSideLoadBalancer"
internal = false
load_balancer_type = "application"
security_groups = [aws_security_group.gtm-security-group.id]
subnets = data.aws_subnets.public.ids
enable_deletion_protection = false
}
....
I also tried adding these:
resource "aws_appautoscaling_target" "ecs_target" {
max_capacity = 4
min_capacity = 1
resource_id = "service/${aws_ecs_cluster.gtm.name}/${aws_ecs_service.PrimaryServerSideService.name}"
scalable_dimension = "ecs:service:DesiredCount"
service_namespace = "ecs"
}
resource "aws_appautoscaling_policy" "ecs_policy" {
name = "scale-down"
policy_type = "StepScaling"
resource_id = aws_appautoscaling_target.ecs_target.resource_id
scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs_target.service_namespace
step_scaling_policy_configuration {
adjustment_type = "ChangeInCapacity"
cooldown = 60
metric_aggregation_type = "Maximum"
step_adjustment {
metric_interval_upper_bound = 0
scaling_adjustment = -1
}
}
}
but the 502 errors persist.
You are looking in the right direction, and there are only two things left to do:
1. Decide which metric you want to scale on (for example CPU or memory utilization).
2. Change your resource "aws_appautoscaling_policy" "ecs_policy" to scale based on the metric from point 1.
Right now your ecs_policy doesn't have any metric to scale on, and its only step adjustment scales the service down. Here is an example:
resource "aws_appautoscaling_policy" "ecs_target_cpu" {
name = "application-scaling-policy-cpu"
policy_type = "TargetTrackingScaling"
resource_id = aws_appautoscaling_target.ecs_service_target.resource_id
scalable_dimension = aws_appautoscaling_target.ecs_service_target.scalable_dimension
service_namespace = aws_appautoscaling_target.ecs_service_target.service_namespace
target_tracking_scaling_policy_configuration {
predefined_metric_specification {
predefined_metric_type = "ECSServiceAverageCPUUtilization"
}
target_value = 80
}
depends_on = [aws_appautoscaling_target.ecs_service_target]
}
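Note that this example references an aws_appautoscaling_target named ecs_service_target; in your config the target is called ecs_target, so either rename it or point the policy at your existing target. As a minimal sketch wired to your ecs_target (the memory metric, the 75 % threshold, the policy name, and the 60-second cooldowns are assumptions, pick whatever your Container Insights data shows actually saturates first under load):

resource "aws_appautoscaling_policy" "ecs_target_memory" {
  # Hypothetical companion policy: target tracking on the service's
  # average memory utilization, attached to the existing "ecs_target".
  name               = "application-scaling-policy-memory"
  policy_type        = "TargetTrackingScaling"
  resource_id        = aws_appautoscaling_target.ecs_target.resource_id
  scalable_dimension = aws_appautoscaling_target.ecs_target.scalable_dimension
  service_namespace  = aws_appautoscaling_target.ecs_target.service_namespace

  target_tracking_scaling_policy_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ECSServiceAverageMemoryUtilization"
    }
    # Assumed values: scale out when average memory passes 75 %,
    # with 60-second cooldowns in both directions.
    target_value       = 75
    scale_in_cooldown  = 60
    scale_out_cooldown = 60
  }
}

With target tracking you don't have to create the CloudWatch alarms yourself; Application Auto Scaling creates and manages them for you, which is exactly what your StepScaling policy was missing. Depending on how high the traffic peaks are, you may also want to raise max_capacity on the target above 4.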