My first question here! I've built a Data Pipeline for daily ETL whichs moves and tranforms data between Aurora, Redshift and Hive. All works well however I'm truly stuck on trying to implement a Shellcommandprecondition. The aim is to check the total row count in a view sitting on Aurora MySQL. If the view is empty (0 rows) then Data Pipeline should execute. If there are rows in the view - then the pipeline wait a bit, and then eventually fail after 4 retries.
Can someone help me out with the code for the actual check and query? This is what I've got so far but no luck with it:
#!/bin/bash
count=`mysql -u USER -pPW -h MASTERPUBLIC -p 3306 -D DBNAME -s -N -e "SELECT count(*) from MyView"`
if $count = 0
then exit 0
else exit 1
fi
In the pipeline definition it looks as follows:
{
"retryDelay": "15 Minutes",
"scriptUri": "s3://mybucket/ETLprecondition.bash",
"maximumRetries": "4",
"name": "CheckViewEmpty",
"id": "PreconditionId_pznm2",
"type": "ShellCommandPrecondition"
},
I have very little experience coding so I may be completely off...
Right, a few hours have passed and I finally solved it. There were a few issues holding me up.
Final and working code is:
#!/bin/bash
if type mysql >/dev/null 2>&1; then
count=`mysql -u USER -pPW -h MASTERPUBLIC -p 3306 -D DBNAME -s -N -e "SELECT count(*) from MyView"`
if [ "$count" -eq "0" ];
then exit 0
else exit 1
fi
else
sudo yum install -y mysql
count=`mysql -u USER -pPW -h MASTERPUBLIC -p 3306 -D DBNAME -s -N -e "SELECT count(*) from MyView"`
if [ "$count" -eq "0" ];
then exit 0
else exit 1
fi
fi