I am running Airflow v2.3.2 / Python 3.10 from the Docker Image below.
apache/airflow:2.3.2-python3.10
The Docker image pins paramiko==2.7.2 in order to address authentication issues that had been seen in testing.
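For reference, the pinned version can be confirmed from inside the container (a minimal check, nothing Airflow-specific):

import paramiko

# Should print 2.7.2 if the image pin took effect
print(paramiko.__version__)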
When calling the SFTP hook, I am using the following:
sftp = SFTPHook("connection|sftp")
sftp.look_for_keys = False
sftp.get_conn()
I have also tried it without the sftp.look_for_keys line.
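For context, once get_conn() succeeds the hook is driven along these lines (a sketch; the file name is illustrative, not from my DAG):

from airflow.providers.sftp.hooks.sftp import SFTPHook

sftp = SFTPHook("connection|sftp")
sftp.look_for_keys = False
conn = sftp.get_conn()  # underlying SFTP client
conn.put("./report.csv", "report.csv")  # put() exists on both pysftp and paramiko clients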
In the Airflow UI, I have configured the connection's Extra section as follows:
{
"private_key": "privatekeyinfo",
"no_host_key_check": true
}
"privatekeyinfo"
is string format "-----BEGIN OPENSSH PRIVATE KEY----- with '\n' line breaks written in.
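An easy way to get those '\n' escapes right is to let the json module produce them (a standard-library sketch; the key path is a placeholder):

import json

# Read the multi-line key file and let json.dumps escape the newlines,
# yielding a value that can be pasted straight into the Extra field
with open("id_rsa") as f:
    key_text = f.read()
print(json.dumps({"private_key": key_text, "no_host_key_check": True}))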
When I test the connection within the UI, it reports Connection successfully tested. However, when the script that calls the hook runs, I receive the following:
[TIMESTAMP] {transport.py:1819} INFO - Connected (version 2.0, client dropbear)
[TIMESTAMP] {transport.py:1819} INFO - Authentication (password) failed.
I have also attempted to pass the "host_key" in the Extra field but get the same authentication error.
To be explicit, I have tried the following combinations -

sftp.look_for_keys = False and "no_host_key_check": true
sftp.look_for_keys = False and "host_key": "host_key_value"
#sftp.look_for_keys = False and "no_host_key_check": true
#sftp.look_for_keys = False and "host_key": "host_key_value"

(the # indicating the look_for_keys line commented out)
Testing the connection in Airflow succeeds with "no_host_key_check": true in Extra, and it likewise succeeds with "host_key": "host_key_value" in Extra.
Additional Logging from Paramiko -
[TIMESTAMP] {transport.py:1819} DEBUG - starting thread (client mode): 0x9e33d000
[TIMESTAMP] {transport.py:1819} DEBUG - Local version/idstring: SSH-2.0-paramiko_2.7.2
[TIMESTAMP] {transport.py:1819} DEBUG - Remote version/idstring: SSH-2.0-dropbear [SERVER]
[TIMESTAMP] {transport.py:1819} INFO - Connected (version 2.0, client dropbear)
[TIMESTAMP] {transport.py:1819} DEBUG - kex algos:['diffie-hellman-group1-sha1', 'diffie-hellman-group14-sha256', 'diffie-hellman-group14-sha1'] server key:['ssh-dss', 'ssh-rsa'] client encrypt:['blowfish-cbc', 'aes128-ctr', 'aes128-cbc', '3des-cbc'] server encrypt:['blowfish-cbc', 'aes128-ctr', 'aes128-cbc', '3des-cbc'] client mac:['hmac-sha1', 'hmac-md5-96', 'hmac-sha1-96', 'hmac-md5'] server mac:['hmac-sha1', 'hmac-md5-96', 'hmac-sha1-96', 'hmac-md5'] client compress:['none'] server compress:['none'] client lang:[''] server lang:[''] kex follows?False
[TIMESTAMP] {transport.py:1819} DEBUG - Kex agreed: diffie-hellman-group14-sha256
[TIMESTAMP] {transport.py:1819} DEBUG - HostKey agreed: ssh-rsa
[TIMESTAMP] {transport.py:1819} DEBUG - Cipher agreed: aes128-ctr
[TIMESTAMP] {transport.py:1819} DEBUG - MAC agreed: hmac-sha1
[TIMESTAMP] {transport.py:1819} DEBUG - Compression agreed: none
[TIMESTAMP] {transport.py:1819} DEBUG - kex engine KexGroup14SHA256 specified hash_algo <built-in function openssl_sha256>
[TIMESTAMP] {transport.py:1819} DEBUG - Switch to new keys ...
[TIMESTAMP] {transport.py:1819} DEBUG - Attempting password auth...
[TIMESTAMP] {transport.py:1819} DEBUG - userauth is OK
[TIMESTAMP] {transport.py:1819} INFO - Authentication (password) failed.
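For anyone reproducing this, paramiko debug output like the above can be captured with its logging helper (a sketch, not part of my DAG):

import logging
import paramiko

# Writes transport-level DEBUG lines (kex negotiation, auth attempts) to a file
paramiko.util.log_to_file("paramiko_debug.log", level=logging.DEBUG)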
Additionally, the SFTP server already has the public key and can be connected to using the private key (verified using both CyberDuck and a locally running version of Airflow).
Even on the hosted version of Airflow, in the Connections section within the Admin drop-down, when I go into the sftp connection and select Test, it returns Connection successfully tested. The issue only occurs within the DAG, where it looks like it is trying to authenticate using a password instead of the private key that is provided for that connection.
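A quick way to check what the worker actually resolves for this connection is to dump the extras from within a task (a diagnostic sketch using the standard Connection API):

from airflow.hooks.base import BaseHook

conn = BaseHook.get_connection("connection|sftp")
# If "private_key" is missing here, the hook has nothing to key-auth with
print(conn.extra_dejson.keys())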
Link to Airflow GH discussion - https://github.com/apache/airflow/discussions/31318
After a lot of tinkering, I ended up finding a solution that doesn't directly use the SFTPHook.
import io
import paramiko
from airflow.models import Variable

def send_to_sftp(filename):
    # The private key is stored as an Airflow Variable rather than in the connection
    var_key = Variable.get("Key")
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.client.WarningPolicy)
    host = "host.com"
    key = paramiko.RSAKey.from_private_key(io.StringIO(var_key))
    # Disabling rsa-sha2-* forces plain ssh-rsa, which this dropbear server advertises
    ssh.connect(host, port=###, username='xxxxxxxx', pkey=key,
                disabled_algorithms=dict(pubkeys=["rsa-sha2-512", "rsa-sha2-256"]))
    sftp = ssh.open_sftp()
    sftp.put('./' + filename, filename)
    print('File uploaded')
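Wiring the helper into a DAG then looks something like this (a hypothetical task definition; the task_id and filename are illustrative):

from airflow.operators.python import PythonOperator

send_file = PythonOperator(
    task_id="send_to_sftp",
    python_callable=send_to_sftp,
    op_args=["report.csv"],  # illustrative filename
)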
It also appears, after testing locally on a machine that had not created the public/private key pair, that installing newer versions of the SFTP and SSH providers did the trick.
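With the newer providers installed, the original hook-based approach should work without the workaround (a sketch using the hook's own transfer helper; paths are illustrative):

from airflow.providers.sftp.hooks.sftp import SFTPHook

sftp = SFTPHook("connection|sftp")
sftp.store_file("report.csv", "./report.csv")  # remote path first, then local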
The root of the problem was that authentication did not look to the private key passed in the Extra field and then failed when it could not find a password.