amazon-web-services  amazon-s3  grafana-loki

failed to delete corrupted cluster seed file, deleting it err="InvalidAccessKeyId"


I get the error below when running Loki locally.

"failed to delete corrupted cluster seed file, deleting it" err="InvalidAccessKeyId: The AWS Access Key Id you provided does not exist in our records.\n\tstatus code: 403, request id: XXXXX, host id: XXXXXYYYZZZ"
  • Loki 2.8.2
  • macOS Ventura 13.5
  • aws-cli/2.13.11 Python/3.11.4
  • AWS credentials configured correctly using the AWS CLI

loki -config.file=loki-local-config.yaml -config.expand-env=true

loki-local-config.yaml

auth_enabled: false

server:
  http_listen_port: 3100
  grpc_listen_port: 9096

ingester:
  wal:
    enabled: true
    dir: /tmp/loki/wal
  lifecycler:
    address: 127.0.0.1
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
    final_sleep: 0s
  chunk_idle_period: 1h
  max_chunk_age: 1h
  chunk_target_size: 1048576
  chunk_retain_period: 30s
  max_transfer_retries: 0

query_range:
  results_cache:
    cache:
      embedded_cache:
        enabled: true
        max_size_mb: 100

schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: s3
      schema: v11
      index:
        prefix: index_
        period: 24h

storage_config:
  boltdb_shipper:
    active_index_directory: /tmp/loki/index
    cache_location: /tmp/loki/index_cache
    cache_ttl: 24h
    resync_interval: 5s
    shared_store: s3
  aws:
    endpoint: s3.us-west-1.amazonaws.com
    bucketnames: loki-poc-devops-ops-efficiency
    region: us-west-1
    access_key_id: aws_access_key_id
    secret_access_key: aws_secret_access_key
    insecure: false
    sse_encryption: false
    http_config:
      idle_conn_timeout: 90s
      response_header_timeout: 0s
      insecure_skip_verify: false
    s3forcepathstyle: true

compactor:
  working_directory: /tmp/loki/compactor
  shared_store: s3
  compaction_interval: 5m
  shared_store_key_prefix: index/

limits_config:
  reject_old_samples: true
  reject_old_samples_max_age: 168h
  retention_period: 744h

table_manager:
  retention_deletes_enabled: true
  retention_period: 24h

Solution

  • This error occurs in one of three cases:

    1. The credentials you supplied are wrong or expired.
    2. The credentials are incomplete, for example a required field is missing. (This was my mistake.)
    3. The credentials are not parsed correctly from the config file.

    In my case (the 2nd), I forgot to add session_token to the config.
    The session token is required whenever you use temporary security credentials to make programmatic requests for AWS resources through the AWS CLI or AWS API.

      aws:
        endpoint: s3.us-west-1.amazonaws.com
        bucketnames: loki-poc-devops-ops-efficiency
        insecure: false
        sse_encryption: false
        s3forcepathstyle: true
        region: us-west-1
        access_key_id: ${AWS_ACCESS_KEY_ID}
        secret_access_key: ${AWS_SECRET_ACCESS_KEY}
        session_token: ${AWS_SESSION_TOKEN}
        http_config:
          idle_conn_timeout: 90s
          response_header_timeout: 0s
          insecure_skip_verify: false
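
A quick way to tell whether case 2 applies is to look at the access key ID prefix: AWS documents AKIA as the prefix for long-term access keys and ASIA as the prefix for temporary (STS) access keys, and only the temporary kind needs a session token. The helper below is a minimal sketch of that check; the function name key_kind and the sample key ID are made up for illustration.

```shell
# Classify an access key ID by its documented prefix.
#   AKIA... = long-term access key (no session token needed)
#   ASIA... = temporary STS access key (session token required)
key_kind() {
  case "$1" in
    ASIA*) echo "temporary" ;;
    AKIA*) echo "long-term" ;;
    *)     echo "unknown" ;;
  esac
}

# Hypothetical key ID, used only to show the call.
key_kind "ASIAEXAMPLEKEYID0000"
```

If this prints "temporary" for your own key ID, session_token has to be set in the aws block as well.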
    

    Also note that I changed the credentials to be read from environment variables, written as ${AWS_CRED_XYZ}. For this to work, export the variables in the terminal (or in your bash profile, etc.)
    and pass -config.expand-env=true so that Loki expands them at startup (loki -config.file=loki-local-config.yaml -config.expand-env=true).
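
The export step can be sketched as follows, assuming temporary credentials so that all three variables are needed. The values are placeholders and check_aws_env is a hypothetical helper, not part of Loki or the AWS CLI. It only confirms that the variables are non-empty before Loki is started, since an empty ${AWS_SESSION_TOKEN} would expand to an empty string in the config.

```shell
# Report any credential variable that is unset or empty.
# Returns 0 when all three are present, 1 otherwise.
check_aws_env() {
  missing=0
  for name in AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN; do
    # Indirect lookup written with eval so it works in sh, bash, and zsh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Placeholder values only; substitute your real temporary credentials.
export AWS_ACCESS_KEY_ID="ASIAEXAMPLEKEYID0000"
export AWS_SECRET_ACCESS_KEY="example-secret"
export AWS_SESSION_TOKEN="example-session-token"

check_aws_env && echo "credentials exported"
# Then start Loki with the same command as in the question:
# loki -config.file=loki-local-config.yaml -config.expand-env=true
```

Running the check in the same shell session that launches Loki matters, because exported variables are only visible to processes started from that session.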