Below is my shell script, from which I am trying to invoke a few Hive SQL queries. It is working fine.
#!/bin/bash
DATE_YEST_FORMAT1=`perl -e 'use POSIX qw(strftime); print strftime "%Y-%m-%d",localtime(time()- 3600*504);'`
echo $DATE_YEST_FORMAT1
hive -e "
SELECT t1 [0] AS buyer_id
,t1 [1] AS item_id
,created_time
FROM (
SELECT split(ckey, '\\\\|') AS t1
,created_time
FROM (
SELECT CONCAT (
buyer_id
,'|'
,item_id
) AS ckey
,created_time
FROM dw_checkout_trans
WHERE to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) AS BIGINT))) = '$DATE_YEST_FORMAT1' distribute BY ckey sort BY ckey
,created_time DESC
) a
WHERE rank(ckey) < 1
) X
ORDER BY buyer_id
,created_time DESC;"
sleep 120
QUERY1=`hive -e "
set mapred.job.queue.name=hdmi-technology;
SELECT SUM(total_items_purchased), SUM(total_items_missingormismatch) from lip_data_quality where dt='$DATE_YEST_FORMAT2';"`
Problem Statement:-
Look at my first hive -e block, after the echo $DATE_YEST_FORMAT1. Sometimes that query fails for various reasons. Currently, if the first Hive SQL query fails, the script still goes on to the second Hive SQL query after sleeping for 120 seconds, and that is what I don't want. Is there any way that, if the first query fails for any reason, the script stops automatically at that point and then starts running again from the beginning after a few minutes (the delay should be configurable)?
Update:-
As suggested by Stephen, I tried something like this:
#!/bin/bash
hive -e " blaah blaah;"
RET_VAL=$?
echo $RET_VAL
if [ $RET_VAL -ne 0]; then
echo "HiveQL failed due to certain reason" | mailx -s "LIP Query Failed" -r rj@host.com rj@host.com
exit(1)
I got the error below, and I didn't get any email either. Is anything wrong with my syntax and approach?
syntax error at line 152: `exit' unexpected
Note:-
Zero is success here if the Hive Query is executed successfully.
Another Update after putting the space:- After making changes like below
#!/bin/bash
hive -e " blaah blaah;"
RET_VAL=$?
echo $RET_VAL
if [ $RET_VAL -ne 0 ]; then
echo "HiveQL failed due to certain reason for LIP" | mailx -s "LIP Query Failed" -r rj@host.com rj@host.com
fi
exit
hive -e 'Another SQL Query;'
I got something like below-
RET_VAL=0
+ echo 0
0
+ [ 0 -ne 0 ]
+ exit
The status code was zero because my first query was successful, but my program exited after that and did not go on to execute my second query. Why? I am surely missing something here again.
Unless I'm misunderstanding the situation, it's very simple:
#!/bin/bash
DATE_YEST_FORMAT1=`perl -e 'use POSIX qw(strftime); print strftime "%Y-%m-%d",localtime(time()- 3600*504);'`
echo $DATE_YEST_FORMAT1
QUERY0="
SELECT t1 [0] AS buyer_id
,t1 [1] AS item_id
,created_time
FROM (
SELECT split(ckey, '\\\\|') AS t1
,created_time
FROM (
SELECT CONCAT (
buyer_id
,'|'
,item_id
) AS ckey
,created_time
FROM dw_checkout_trans
WHERE to_date(from_unixtime(cast(UNIX_TIMESTAMP(created_time) AS BIGINT))) = '$DATE_YEST_FORMAT1' distribute BY ckey sort BY ckey
,created_time DESC
) a
WHERE rank(ckey) < 1
) X
ORDER BY buyer_id
,created_time DESC;"
if hive -e "$QUERY0"
then
sleep 120
QUERY1=`hive -e "
set mapred.job.queue.name=hdmi-technology;
SELECT SUM(total_items_purchased), SUM(total_items_missingormismatch) from lip_data_quality where dt='$DATE_YEST_FORMAT2';"`
# ...and whatever you do with $QUERY1...
fi
The string $QUERY0 is for convenience, not necessity. The key point is that you can test whether a command succeeded (returned status 0) with the if statement. The test command (better known as [) is just a command that returns 0 when the tested condition is met, and 1 (non-zero) when it is not met.
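To make the point about exit statuses concrete, here is a small sketch that prints the status returned by [ and then uses if with a command other than [. The built-in false is used only as a stand-in for a command that fails; it is not part of the original script.

```shell
#!/bin/bash
# `[` is an ordinary command: it reports its result through the exit status.
[ 0 -ne 0 ]
echo "status of [ 0 -ne 0 ]: $?"    # status is 1 (condition not met)

[ 1 -ne 0 ]
echo "status of [ 1 -ne 0 ]: $?"    # status is 0 (condition met)

# `if` accepts any command, not only `[`.
# `false` is a stand-in for a failing command; it always returns non-zero.
if false; then
    echo "not reached"
else
    echo "false returned non-zero"
fi
```

Because [ is a command, its closing ] is just its last argument, which is why it must be separated from the preceding word by a space.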
So, the if statement runs the first hive query; if it passes (exit status 0), then (and only then) does it move on to the actions in the then clause.
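The same idea can be turned around to stop the script on failure, which is what the attempts in the question's updates were aiming for. The sketch below is only an illustration: first_query is a hypothetical stand-in for hive -e "$QUERY0", and check_step is a helper name invented here. It uses exit 1 rather than exit(1), keeps the space before ], closes the if with fi, and calls exit only on the failure path so that a successful first query lets the script continue.

```shell
#!/bin/bash
# Hypothetical stand-in for the first query; replace with: hive -e "$QUERY0"
first_query() { return 0; }

# Hypothetical helper: run a command, report a failure, pass its status on.
check_step() {
    "$@"
    local ret_val=$?
    if [ "$ret_val" -ne 0 ]; then       # note the space before ]
        echo "step failed with status $ret_val" >&2
        return "$ret_val"
    fi
    return 0
}

# Exit only when the step fails, not unconditionally.
check_step first_query || exit 1
echo "first query succeeded; continuing"
```

In the real script the echo to standard error could be replaced by the mailx notification from the question.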
I've resisted the temptation to reformat your SQL; suffice to say, it is not the layout I would use in my own code.
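The question also asks for the first query to be rerun after a configurable delay when it fails. One possible way to do that, offered as a sketch rather than a tested solution for Hive, is a small retry loop. The function name run_with_retry and its two parameters (maximum attempts and delay in seconds) are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper: run a command, retrying on a non-zero exit status.
# Usage: run_with_retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
run_with_retry() {
    local max_attempts=$1
    local delay=$2
    shift 2
    local attempt=1
    while true; do
        if "$@"; then
            return 0                    # command succeeded; stop retrying
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in $delay seconds" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

With this in place, the first query could be invoked as run_with_retry 3 300 hive -e "$QUERY0" || exit 1, which would try up to three times, five minutes apart, and stop the script if every attempt fails. Capping the number of attempts avoids looping forever when the failure is permanent.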