python paramiko scp

Transferring a tar.gz file via an SCP client corrupts the file


  1. Compress several directories on a Linux server into a tar.gz file.
  2. Download the compressed tar.gz file from the server to a Windows computer.
  3. Try to untar the file via Python's tarfile module.
  4. The process raises an "Empty file" error (the untar fails).

I need to create the tar file on the server because I have to transfer lots of small files (most of them are less than a kilobyte). So I 1) compress the files into a tar.gz file on the server, 2) transfer it via an SCP client, 3) remove the tar file from the server (if needed), 4) extract the downloaded file inside my Python program, and 5) create Excel statistics.

I checked the tar.gz file on the server side and I am sure that it is not corrupted (I mean it was compressed correctly). It extracts without any error if I extract it on the server over SSH. But the error above appears when my program transfers the tar.gz file from the server via the SCP client. And when I transfer the file manually using FileZilla and extract it using Git Bash, it is not corrupted.
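
The same check can be repeated on the Windows side before extracting, to see whether the local copy is readable at all. The following is only a minimal sketch using the standard tarfile module; the function name `archive_is_readable` is hypothetical and not part of the program above. It reads every member to the end, so an empty or truncated download is reported as unreadable.

```python
import tarfile
import zlib


def archive_is_readable(path):
    """Return True if every regular file in the .tar.gz at `path` can be read fully."""
    try:
        with tarfile.open(path, mode="r:gz") as archive:
            for member in archive:
                if member.isfile():
                    # Read the member's data to the end so truncation is detected.
                    with archive.extractfile(member) as handle:
                        while handle.read(64 * 1024):
                            pass
        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        # Empty file, truncated gzip stream, damaged data, or missing file.
        return False
```

A False result only says the local file cannot be read through; it does not say why. Comparing the local file size with the size on the server is a quick way to tell a partial download from a damaged one.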

I checked many threads on the Internet; they usually say that it is an SCP binary mode problem. But I'm not sure what I should do to solve this problem.

I use the scp and paramiko libraries. The scp module is responsible for the transfer phase. (I heard that it is an SCP client module built on top of paramiko.)

from datetime import datetime

import paramiko
from scp import SCPClient
... # (Other class functions)
    def downloadCompressedFile(self, remote_paths, save_path):
        # binarial only
        # remote_paths :: Files to be tared
        # save_path :: Local path to be downloaded
        try:
            print('Compression Targets -->\n{}'.format(', '.join(remote_paths)))
            conn = self.getSSHConnection()
            tar_save_path = '{}/{}.tar.gz'.format(ROOT_TAR_PATH, datetime.now().strftime('%Y%m%d_%H%M%S'))
            obj = [ '-C {} ..'.format(p) for p in remote_paths]
            command = 'tar cvzf {} {}'.format(tar_save_path, ' '.join(obj))
            print('Remote Command -- tar -cvzf {} {}'.format(tar_save_path, ' '.join(obj)))
            conn.exec_command(command=command)
            print('Compressions are done. Downloading files from {} to {}'.format(tar_save_path, save_path))
            with SCPClient(conn.get_transport()) as scp:
                scp.get(remote_path=tar_save_path, local_path=save_path)

        except Exception as e:
            raise Exception(e)
...

I expect the file to be transferred without corruption.


Solution

  • I believe your code does not wait for the tar to complete: exec_command returns as soon as the command has started, not when it has finished. So you are downloading an incomplete file.

    See Wait until task is completed on Remote Machine through Python.

    Try this:

    stdin, stdout, stderr = client.exec_command(command)
    print('Compression started')
    stdout.channel.recv_exit_status() # Wait for tar to complete
    print('Compression is done. Downloading files from {} to {}'.format(tar_save_path, save_path))
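
The waiting step can also be wrapped in a small helper that checks whether tar actually succeeded. This is only a sketch under assumptions: `run_and_wait` is a hypothetical name, `client` is expected to be a connected paramiko.SSHClient (such as the object returned by getSSHConnection in the question), and the command is assumed to write little to stderr. Because tar runs with the v flag, the helper reads stdout before waiting, so a long file listing cannot stall the command.

```python
def run_and_wait(client, command):
    """Run `command` over SSH, block until it exits, and return its stdout bytes.

    `client` is assumed to be a connected paramiko.SSHClient.
    """
    _stdin, stdout, stderr = client.exec_command(command)
    # Drain stdout first: tar -v prints one line per file.
    output = stdout.read()
    # Blocks until the remote command has finished.
    exit_status = stdout.channel.recv_exit_status()
    errors = stderr.read().decode(errors="replace")
    if exit_status != 0:
        raise RuntimeError(
            "Remote command failed with status {}: {}".format(exit_status, errors)
        )
    return output
```

In downloadCompressedFile this would replace the bare conn.exec_command(command=command) call, so that scp.get only starts after tar has exited with status 0.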