Grafana 7.4.3 /var/lib/grafana not writeable in AWS ECS - EFS


I'm trying to host a Grafana 7.4 image in ECS Fargate using an EFS volume for persistent storage.

Using Terraform, I have created the required resources and given the task access to the EFS volume via an "access point":

resource "aws_efs_access_point" "default" {
    file_system_id = aws_efs_file_system.default.id

    posix_user {
        gid = 0
        uid = 472
    }

    root_directory {
        path = "/opt/data"

        creation_info {
            owner_gid = 0
            owner_uid = 472
            permissions = "600"
        }
    }
}

Note that I have set the owner permissions as per the guide at https://grafana.com/docs/grafana/latest/installation/docker/#migrate-from-previous-docker-containers-versions (I've tried both group id 0 and 1, as the documentation seems to be inconsistent on the gid).

Using a base alpine image in place of the grafana image, I've confirmed that the directory /var/lib/grafana exists within the container with the correct uid and gid set. However, when I attempt to run the grafana image, I get the error message:

GF_PATHS_DATA='/var/lib/grafana' is not writable.

I am launching the task with a Terraform-managed task definition:

resource "aws_ecs_task_definition" "default" {
    family = var.name
    container_definitions = data.template_file.container_definition.rendered
    memory = var.memory
    cpu = var.cpu
    requires_compatibilities = [
        "FARGATE"
    ]
    network_mode = "awsvpc"
    execution_role_arn = "arn:aws:iam::REDACTED_ID:role/ecsTaskExecutionRole"

    volume {
        name = "${var.name}-volume"

        efs_volume_configuration {
            file_system_id = aws_efs_file_system.default.id
            transit_encryption = "ENABLED"
            root_directory = "/opt/data"

            authorization_config {
                access_point_id = aws_efs_access_point.default.id
            }
        }
    }

    tags = {
        Product = var.name
    }
}

With the container definition:

[
    {
        "portMappings": [
            {
                "hostPort": 80,
                "protocol": "tcp",
                "containerPort": 80
            }
        ],
        "mountPoints": [
            {
                "sourceVolume": "${volume_name}",
                "containerPath": "/var/lib/grafana",
                "readOnly": false
            }
        ],
        "cpu": 0,
        "secrets": [
            ...
        ],
        "environment": [],
        "image": "grafana/grafana:7.4.3",
        "name": "${name}",
        "user": "472:0"
    }
]

For "user" I have tried "grafana", "742:0", "742" and, when trying gid 1, "742:1".

I believe the Terraform, security groups, mount targets, etc. are all correct, because an alpine image using the same volume can run:

ls -lash /var/lib
> drw------- 2 472 root 6.0K Mar 12 11:22 grafana
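
One detail in the listing above may be worth checking: the mode is drw-------, i.e. 600, which gives the owner read and write but not the execute (search) bit. On POSIX file systems a directory without that bit cannot be entered, and files cannot be created inside it, even by its owner. The following is a minimal sketch of the same access point with a mode that includes the execute bit. The value 755 is an assumption chosen for illustration; the original post does not confirm it as the fix.

```hcl
resource "aws_efs_access_point" "default" {
    file_system_id = aws_efs_file_system.default.id

    posix_user {
        gid = 0
        uid = 472
    }

    root_directory {
        path = "/opt/data"

        creation_info {
            owner_gid = 0
            owner_uid = 472
            # Assumption: 755 instead of 600, so that the owner (uid 472)
            # also has the execute/search bit on the directory.
            permissions = "755"
        }
    }
}
```

Note that EFS applies creation_info only when it creates the root directory itself. If /opt/data already exists with mode 600, it would presumably need to be changed by hand or replaced by a new path.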

Solution

  • I believe the problem is on the AWS ECS side; see https://github.com/aws/containers-roadmap/issues/938

    Anyway, a file-system approach doesn't seem very cloud friendly, especially if you want to scale horizontally (problems with concurrent writes from multiple tasks, IOPS limitations, ...). Just provision a proper DB (e.g. Aurora MySQL on RDS, with a Multi-AZ cluster if you need HA) and you will have a nice opsless AWS deployment.
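
As a rough illustration of the database suggestion above, the sketch below points Grafana at an external MySQL database through its GF_DATABASE_* environment variables, so the task needs no EFS volume. All variable names, the resource name "grafana_db", and the database name and user "grafana" are hypothetical and do not come from the original post. The execution role is assumed to be allowed to read the referenced secret.

```hcl
# Hypothetical inputs; none of these names appear in the original post.
variable "name" { type = string }
variable "execution_role_arn" { type = string }
variable "db_host" { type = string }                # "host:port" of the MySQL endpoint
variable "db_password_secret_arn" { type = string } # Secrets Manager or SSM parameter ARN

resource "aws_ecs_task_definition" "grafana_db" {
    family                   = var.name
    requires_compatibilities = ["FARGATE"]
    network_mode             = "awsvpc"
    cpu                      = 256
    memory                   = 512
    execution_role_arn       = var.execution_role_arn

    # No volume block: Grafana keeps its state in the database instead.
    # portMappings are omitted here for brevity.
    container_definitions = jsonencode([
        {
            name  = var.name
            image = "grafana/grafana:7.4.3"
            environment = [
                { name = "GF_DATABASE_TYPE", value = "mysql" },
                { name = "GF_DATABASE_HOST", value = var.db_host },
                { name = "GF_DATABASE_NAME", value = "grafana" },
                { name = "GF_DATABASE_USER", value = "grafana" }
            ]
            secrets = [
                # Injected at start-up rather than stored in the task definition.
                { name = "GF_DATABASE_PASSWORD", valueFrom = var.db_password_secret_arn }
            ]
        }
    ])
}
```

With this layout, several tasks can share the same database, which avoids the concurrent-write concern mentioned above.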