AWS IAM Setup for EC2 Resource in AWS Data Pipeline

I am having an issue getting AWS Data Pipeline to run on an EC2 instance via a ShellCommandActivity.

I have been following the guide found here step by step:

The primary issue is that the pipeline hangs in the WAITING_FOR_RUNNER status. I have confirmed that my Python script and .bat file (I had to change from .sh because I am using a Windows EC2 instance) run inside the desired EC2 instance. However, as far as I can tell, the issue is a result of the warning I receive in the Data Pipeline Architect:

WARNING: Could not validate S3 Access for role. Please ensure role ('DataPipelineDefaultRole') has s3:Get*, s3:List*, s3:Put* and sts:AssumeRole permissions for DataPipeline.

I have tried editing the IAM roles so that DataPipelineDefaultRole and DataPipelineDefaultResourceRole both have the AmazonEC2FullAccess, AmazonS3FullAccess, AWSDataPipelineRole, and AWSDataPipeline_FullAccess policies attached. I have also tried the suggested inline policies shown here: AWS Data Pipeline: Issue with permissions S3 Access for IAM role and here

I have let these policies sit for hours and have rebuilt the pipeline a few times, but I still get that specific warning. Do you have any ideas?


  • According to the AWS Data Pipeline documentation, a custom AMI used as a pipeline resource must have Linux installed. This setup therefore cannot currently work on a Windows EC2 instance and must be done on a Linux EC2 instance.
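
Separately from the Linux requirement, the role warning quoted in the question can be sanity-checked offline. Below is a minimal sketch, assuming you have copied the role's permissions policy and trust policy JSON out of the IAM console. The helper names (`missing_actions`, `trusts_service`) and the sample documents are hypothetical, and the check is deliberately simplified: it ignores `Deny` statements, `NotAction`, `Resource`, and `Condition`, and it makes no AWS calls.

```python
import fnmatch

# Actions named in the Data Pipeline Architect warning.
REQUIRED_ACTIONS = ["s3:Get*", "s3:List*", "s3:Put*"]


def _as_list(value):
    """IAM JSON allows a single item or a list; normalize to a list."""
    return value if isinstance(value, list) else [value]


def missing_actions(policy, required=REQUIRED_ACTIONS):
    """Return the required actions that no Allow statement covers."""
    patterns = []
    for stmt in _as_list(policy.get("Statement", [])):
        if stmt.get("Effect") == "Allow":
            patterns.extend(a.lower() for a in _as_list(stmt.get("Action", [])))
    # IAM action names are case-insensitive, so compare in lowercase.
    return [
        req for req in required
        if not any(fnmatch.fnmatchcase(req.lower(), p) for p in patterns)
    ]


def trusts_service(trust_policy, service):
    """True if the trust policy lets `service` call sts:AssumeRole."""
    for stmt in _as_list(trust_policy.get("Statement", [])):
        if stmt.get("Effect") != "Allow":
            continue
        actions = [a.lower() for a in _as_list(stmt.get("Action", []))]
        if "sts:assumerole" not in actions:
            continue
        principal = stmt.get("Principal", {})
        if isinstance(principal, dict) and service in _as_list(principal.get("Service", [])):
            return True
    return False


# Hypothetical sample documents, for illustration only.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:Get*", "s3:List*"], "Resource": "*"}
    ],
}
sample_trust = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": ["datapipeline.amazonaws.com"]},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(missing_actions(sample_policy))  # ['s3:Put*']
print(trusts_service(sample_trust, "datapipeline.amazonaws.com"))  # True
```

For DataPipelineDefaultRole the trusted service to look for is `datapipeline.amazonaws.com`; for DataPipelineDefaultResourceRole it is `ec2.amazonaws.com`. If both checks pass and the pipeline still sits in WAITING_FOR_RUNNER, the roles are probably not the cause, which fits the Windows versus Linux explanation above.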