Search code examples
pythonfiletarin-memory

How can I programmatically create a tar archive of nested directories and files solely from Python strings and without temporary files?


I want to create a tar archive with a hierarchical directory structure from Python, using strings for the contents of the files. I've read this question , which shows a way of adding strings as files, but not as directories. How can I add directories on the fly to a tar archive without actually making them?

Something like:

archive.tgz:
    file1.txt
    file2.txt
    dir1/
        file3.txt
        dir2/
            file4.txt

Solution

  • Extending the example given in the question linked, you can do it as follows:

    import tarfile
    import StringIO
    import time
    
    tar = tarfile.TarFile("test.tar", "w")
    
    string = StringIO.StringIO()
    string.write("hello")
    string.seek(0)
    
    info = tarfile.TarInfo(name='dir')
    info.type = tarfile.DIRTYPE
    info.mode = 0755
    info.mtime = time.time()
    tar.addfile(tarinfo=info)
    
    info = tarfile.TarInfo(name='dir/foo')
    info.size=len(string.buf)
    info.mtime = time.time()
    tar.addfile(tarinfo=info, fileobj=string)
    
    tar.close()
    

    Be careful with mode attribute since default value might not include execute permissions for the owner of the directory which is needed to change to it and get its contents.