I have created a .tar file on a Linux machine as follows:
tar cvf test.tar test_folder/
where the test_folder contains some files as shown below:
test_folder
|___ file1.jpg
|___ file2.jpg
|___ ...
I am unable to programmatically extract the individual files within the tar archive using Python. More specifically, I have tried the following:
import tarfile
with tarfile.open('test.tar', 'r:') as tar:
    img_file = tar.extractfile('test_folder/file1.jpg')
    # img_file contains the object: <ExFileObject name='test_folder/test.tar'>
Here, img_file does not seem to contain the requested image; rather, it appears to contain the source .tar file. I am not sure where I am going wrong. Any suggestions would be really helpful. Thanks in advance.
Appending 2 lines to your code will solve your problem:
import tarfile
with tarfile.open('test.tar', 'r:') as tar:
    img_file = tar.extractfile('test_folder/file1.jpg')
    # --------------------- Add this ---------------------------
    with open("img_file.jpg", "wb") as outfile:
        outfile.write(img_file.read())
The explanation:
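The .extractfile() method only gives you a file-like object for reading the member's content (i.e. its data); it does not write anything to disk. The name shown in the object's repr is that of the underlying archive, which is why it looks as if you got the .tar file back.
So you have to do the writing yourself: read the returned content (img_file.read()) and write it to a file of your choice (outfile.write(...)). Do this while the archive is still open.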
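If the image is large, you may not want to hold it all in memory with a single read(). Here is a minimal sketch of the same idea that streams the data instead. The helper name save_member and its parameters are my own invention for illustration, not part of tarfile:

```python
import shutil
import tarfile


def save_member(tar_path, member_name, out_path):
    """Copy one archive member to out_path in chunks (hypothetical helper)."""
    with tarfile.open(tar_path, "r:") as tar:
        # extractfile() raises KeyError if member_name is not in the archive
        src = tar.extractfile(member_name)
        if src is None:
            # extractfile() returns None for members that are not regular files
            raise ValueError(f"{member_name!r} is not a regular file")
        # Copy while the archive is still open; the reader depends on it
        with src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
```

With the archive from the question, you would call it as save_member('test.tar', 'test_folder/file1.jpg', 'img_file.jpg').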
Or, to simplify your life, use the .extract() method instead. See my other answer.
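For reference, a rough sketch of what the .extract() route could look like. The function name extract_member and the dest_dir parameter are assumptions of mine for illustration. Note that .extract() recreates the member's path from the archive (here test_folder/file1.jpg) under the destination directory rather than letting you pick the output file name:

```python
import os
import tarfile


def extract_member(tar_path, member_name, dest_dir="."):
    """Extract a single member into dest_dir (hypothetical helper)."""
    with tarfile.open(tar_path, "r:") as tar:
        # Writes dest_dir/<member_name>, creating parent directories as needed
        tar.extract(member_name, path=dest_dir)
    # Path where the extracted file ends up
    return os.path.join(dest_dir, member_name)
```

Only do this with archives you trust, since extracting writes to disk using paths stored in the archive.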