hadoop hdfs sftp oozie oozie-coordinator

HDFS file FTP from cluster to another machine


I want to create an Oozie workflow to transfer an HDFS file from an HDFS cluster to another server.

Since Oozie can run commands or scripts on any node in the cluster, is it possible to run a shell script or SFTP on one of the nodes to transfer the file to the destination server?


Solution

  • I think the easiest way to do this is to run an HTTP GET (the WebHDFS OPEN operation) on the HDFS file from the remote server; you can use curl for that.

    Anyway, if you want to do it through Oozie, I think you can create a script that first copies the desired file from HDFS to the local file system of the node it runs on, and then uses scp to copy that local file to the remote server.
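The HTTP GET approach could look roughly like the sketch below, run on the destination machine. It assumes WebHDFS is enabled on the cluster and that the destination machine can reach both the NameNode and the DataNodes, because the NameNode answers an OPEN request with a redirect to a DataNode. The function names, host, port (9870 is the Hadoop 3 default for the NameNode HTTP port, 50070 in Hadoop 2), user, and paths are all placeholders and not taken from the question.

```shell
#!/usr/bin/env bash
# Sketch: pull a file out of HDFS over WebHDFS from the destination machine.
# All hosts, ports, users, and paths used with these functions are placeholders.

# Build the WebHDFS OPEN URL for an absolute HDFS path.
webhdfs_open_url() {
  local namenode_host="$1" namenode_port="$2" hdfs_path="$3" user="${4:-}"
  local url="http://${namenode_host}:${namenode_port}/webhdfs/v1${hdfs_path}?op=OPEN"
  # user.name only applies to clusters using simple (non-Kerberos) authentication.
  if [ -n "$user" ]; then
    url="${url}&user.name=${user}"
  fi
  printf '%s\n' "$url"
}

# Download the file behind a WebHDFS OPEN URL.
# -L follows the NameNode's redirect to a DataNode, -f makes curl exit
# non-zero on an HTTP error, -o names the local output file.
webhdfs_fetch() {
  local url="$1" local_file="$2"
  curl -f -sS -L -o "$local_file" "$url"
}

# Example call (placeholder values):
#   url=$(webhdfs_open_url namenode.example.com 9870 /user/etl/export.csv etl)
#   webhdfs_fetch "$url" /tmp/export.csv
```

On a Kerberos-secured cluster the `user.name` parameter would not be enough, and the request would need SPNEGO authentication instead.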
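For the Oozie route, the script described above (HDFS to local file system, then scp to the remote server) might be sketched as follows. This is only an illustration: `export_hdfs_file` is a hypothetical helper name, and the sketch assumes the node that ends up running the Oozie shell action has the `hdfs` client on its PATH and key-based SSH access to the destination, since there is nobody to type a password.

```shell
#!/usr/bin/env bash
# Sketch of a script an Oozie shell action could run. The function name and
# every path or host passed to it are placeholders.

# Copy one HDFS file to a temporary local directory, then scp it to a remote target.
#   $1: absolute HDFS path, $2: scp target such as user@host:/dir/
export_hdfs_file() {
  local hdfs_path="$1" remote_target="$2"
  local staging_dir local_file status=0
  staging_dir=$(mktemp -d) || return 1
  local_file="${staging_dir}/$(basename "$hdfs_path")"

  # Step 1: HDFS -> local file system of the node running the action.
  # Step 2: local file system -> remote server. BatchMode=yes makes scp fail
  # instead of prompting for a password.
  hdfs dfs -get "$hdfs_path" "$local_file" \
    && scp -o BatchMode=yes "$local_file" "$remote_target" \
    || status=$?

  # Remove the staging copy whether or not the transfer succeeded.
  rm -rf "$staging_dir"
  return "$status"
}

# Example call (placeholder values):
#   export_hdfs_file /user/etl/export.csv etl@dest.example.com:/incoming/
```

Returning the exit status of the failing step lets Oozie mark the action as failed when either the HDFS read or the scp transfer does not succeed.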