amazon-web-services amazon-s3 amazon-ec2 druid pydruid

How to write logs and data to Druid deep storage in AWS S3


We have a Druid cluster set up, and I am now trying to write the indexing logs and data to S3 deep storage.

The relevant configuration is:

druid.storage.type=s3
druid.storage.bucket=bucket-name
druid.storage.baseKey=druid/segments

# For S3:
druid.indexer.logs.type=s3
druid.indexer.logs.s3Bucket=your-bucket
druid.indexer.logs.s3Prefix=druid/indexing-logs

After running an ingestion task, I get the following error:

Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: HCAFAZBA85QW14Q0; S3 Extended Request ID: 2ICzpVAyFcy/PLrnsUWZBJwEo7dFl/S2lwDTMn+v83uTp71jlEe59Q4/vFhwJU5/WGMYramdSIs=; Proxy: null)
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1862) ~[aws-java-sdk-core-1.12.37.jar:?]
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1415) ~[aws-java-sdk-core-1.12.37.jar:?]
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1384) ~[aws-java-sdk-core-1.12.37.jar:?]
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1154) ~[aws-java-sdk-core-1.12.37.jar:?]
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:811) ~[aws-java-sdk-core-1.12.37.jar:?]
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:779) ~[aws-java-sdk-core-1.12.37.jar:?]
 at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:753) ~[aws-java-sdk-core-1.12.37.jar:?]

I tried granting the IAM role access at the bucket level, and the same role is attached to the EC2 instances where the Druid services run.

Can someone please point out which steps I am missing here?


Solution

  • I got it done!

    I created a new IAM role with a policy that grants permissions on both the S3 bucket and the subfolder.

    NOTE: Permission on the S3 bucket itself is a must. Example: if the bucket name is "Bucket_1" and the subfolder where deep storage is configured is "deep_storage", make sure the policy lists both of these resources:

    "arn:aws:s3:::Bucket_1"
    "arn:aws:s3:::Bucket_1/*"

    My mistake was granting permission only at the subfolder level and not at the bucket level.
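To make the two resource ARNs above concrete, here is a minimal sketch in Python that prints an IAM policy document covering both the bucket and the objects inside it. The function name `build_deep_storage_policy` is hypothetical, and the action list (`s3:ListBucket`, `s3:GetObject`, `s3:PutObject`, `s3:DeleteObject`) is an assumption on my part, since the answer does not say which actions were granted; adjust it to what your cluster actually needs.

```python
import json


def build_deep_storage_policy(bucket: str) -> dict:
    """Build an IAM policy document covering a bucket and the objects in it.

    The action lists below are assumptions for illustration only.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level statement: listing applies to the bucket ARN itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Object-level statement: reads and writes apply to keys in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


if __name__ == "__main__":
    # "Bucket_1" is the example bucket name used in the answer.
    print(json.dumps(build_deep_storage_policy("Bucket_1"), indent=2))
```

The printed JSON can be pasted into the policy editor for the role attached to the EC2 instances. The point of the split is that a bucket-level action is evaluated against the bucket ARN, while object-level actions are evaluated against the `/*` ARN, so listing only a subfolder path leaves the bucket-level call denied.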

    Also remove or comment out the following parameters in the common.runtime.properties file on each server of your Druid cluster:

    druid.s3.accessKey=
    druid.s3.secretKey=

    After this change, data is written successfully to S3 deep storage using the IAM role rather than the access key and secret key.
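Since the key properties have to be removed or commented out on every server, a small check can help catch a node that was missed. The following is a rough sketch, not part of Druid: the helper name `find_static_s3_keys` is hypothetical, and it only does a simple line-by-line scan of the properties text.

```python
import re

# Matches an active "druid.s3.accessKey" or "druid.s3.secretKey" assignment.
STATIC_KEY_PATTERN = re.compile(r"^\s*(druid\.s3\.(?:accessKey|secretKey))\s*[=:]")


def find_static_s3_keys(properties_text: str) -> list:
    """Return the druid.s3 key properties that are still active (not commented out)."""
    active = []
    for line in properties_text.splitlines():
        stripped = line.lstrip()
        # Lines starting with "#" or "!" are comments in .properties files.
        if stripped.startswith("#") or stripped.startswith("!"):
            continue
        match = STATIC_KEY_PATTERN.match(line)
        if match:
            active.append(match.group(1))
    return active


if __name__ == "__main__":
    sample = (
        "druid.storage.type=s3\n"
        "#druid.s3.accessKey=\n"
        "druid.s3.secretKey=\n"
    )
    # Reports any key property that still needs to be removed or commented out.
    print(find_static_s3_keys(sample))
```

To use it on a real node, read the contents of that node's common.runtime.properties file and pass the text to the function; an empty list means neither key is set in that file.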