I want to run the following command:
hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz" | hadoop fs -put - hdfs:///unzip_input/input
It works when I run it from a shell after SSHing onto the master node, but it fails when I invoke it through ssh as follows:
ssh -i /home/USER/keypair.pem [email protected] hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz" | hadoop fs -put - hdfs:///unzip_input/input
It gives the error:
zsh: command not found: hadoop
But if I drop the final piped command, it succeeds:
ssh -i /home/USER/keypair.pem [email protected] hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz"
From some searching I've found that this can be caused by JAVA_HOME not being set, but it is set correctly in ~/.bashrc on the master node.
The Hadoop cluster is an Amazon Elastic MapReduce cluster.
Only the first command of your pipeline is executed on the remote host; the rest of the pipeline runs locally on your own machine. The local shell parses the unquoted `|` characters before ssh ever sees them. So, of course, if you don't have hadoop installed locally, zsh prints "command not found" (and if you did have it installed, the final `hadoop fs -put` would write to your local Hadoop, which is probably not what you want).
To pass the whole pipeline to ssh as one remote command, wrap it in double quotes "" or single quotes '':
ssh -i /home/USER/keypair.pem [email protected] 'hadoop fs -ls hdfs:///logs/ | grep -oh "/[^/]*.gz" | grep -oh "[^/]*.gz" | hadoop fs -put - hdfs:///unzip_input/input'
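You can see the same parsing behavior without an EMR cluster. In this sketch, a hypothetical `remote` function (using `sh -c` as a stand-in for `ssh user@host`) shows that an unquoted pipe is consumed by the local shell, while a quoted pipeline is handed to the "remote" shell intact:

```shell
#!/bin/sh
# Hypothetical stand-in for `ssh user@host`: runs its arguments in a
# child shell, the way ssh runs them in a shell on the remote host.
remote() { sh -c "$*"; }

# Unquoted: the local shell splits at '|', so only `echo a.gz` reaches
# `remote`; the grep runs locally -- the same split the question hits.
remote echo a.gz | grep -o 'a\.gz'

# Quoted: the entire pipeline is a single argument, so the child
# ("remote") shell executes both the echo and the grep.
remote 'echo a.gz | grep -o "a\.gz"'
```

Both invocations print `a.gz` here because grep happens to exist locally too; the difference only bites when, as with `hadoop`, the downstream commands exist solely on the remote host.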