I am new to Apache Airflow and, so far, I have been able to work my way through the problems I have encountered.
I have now hit a wall: I need to transfer files to a remote server via SFTP, and I have not had any luck doing it. So far, I have gotten S3 and Postgres/Redshift connections to work in various DAGs via their respective hooks. I have also been able to use the FTPHook successfully against my local FTP server, but I have not been able to figure out how to use SFTP to connect to a remote host.
I am able to connect to the remote host via SFTP with FileZilla, so I know my credentials are correct.
Through Google searching I have found the SFTPOperator, but I am not able to figure out how to use it. I have also found the FTPSHook, but I still have not been able to get it to work.
I keep getting the error "nodename nor servname provided, or not known", or a general "Operation timed out", in my Airflow logs.
Can someone point me in the right direction? Should I be using the FTPSHook with an SSH or FTP Airflow Conn Type? Or do I need to use the SFTPOperator? I am also confused about how I am supposed to set up the credentials in my Airflow connections. Do I use the SSH profile or FTP?
If I can provide any more additional info that may help, please let me know.
Cheers!
SFTPOperator uses ssh_hook under the hood to open an SFTP transport channel that serves as the basis for the file transfer. You can either configure the ssh_hook yourself or provide a connection id via ssh_conn_id.
from airflow.contrib.operators.sftp_operator import SFTPOperator, SFTPOperation
# (newer Airflow versions ship this as airflow.providers.sftp.operators.sftp)

op = SFTPOperator(
    task_id="test_sftp",
    ssh_conn_id="my_ssh_connection",   # Airflow connection of type SSH
    local_filepath="/tmp/file.csv",    # example: local file to upload
    remote_filepath="/data/file.csv",  # example: destination path on the remote host
    operation=SFTPOperation.PUT,
    dag=dag
)
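
If you want to configure the hook yourself instead of relying on a connection stored in Airflow, you can build an SSHHook explicitly and pass it in via ssh_hook. A minimal sketch, where the host, user, password and file paths are placeholders you would replace with your own values:

from airflow.contrib.hooks.ssh_hook import SSHHook
from airflow.contrib.operators.sftp_operator import SFTPOperator, SFTPOperation

# Placeholder credentials -- substitute your own host, user and password or key file.
sftp_hook = SSHHook(
    remote_host="sftp.example.com",
    username="my_user",
    password="my_password",   # or key_file="/path/to/id_rsa"
    port=22,
)

op = SFTPOperator(
    task_id="test_sftp_explicit_hook",
    ssh_hook=sftp_hook,
    local_filepath="/tmp/file.csv",      # example local path
    remote_filepath="/data/file.csv",    # example remote path
    operation=SFTPOperation.PUT,
    dag=dag,
)

If you go the ssh_conn_id route instead, define the connection with the SSH Conn Type (not FTP) and put the host, login and password (or key file) there; the FTP conn type is only for the FTP hooks.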