python: tarfile extraction error IOError: [Errno 22] invalid mode ('wb') or filename

I'm extracting an archive using tarfile. Unfortunately, the archive came from a Linux server and contains several files whose names include characters that are illegal in Windows file names (':').

I'm using the below:

extract = tarfile.open(file)
extract.extractall(path=new_path)
extract.close()

I get the following error: IOError: [Errno 22] invalid mode ('wb') or filename: ... "file::ext"

So I tried suppressing the error with:

try:
    extract = tarfile.open(file)
    extract.extractall(path=new_path)
    extract.close()
except IOError:
    pass

That suppresses the exception, but the extraction does not continue: it stops at the first file that fails.

When I extract the archive with WinRAR, the file is automatically renamed to "file__ext".

Is there a WinRAR extension for Python? Or is there a way to skip the error and continue the extraction, or to rename the file automatically as WinRAR does? I don't mind if the file is skipped.

I saw several posts with this error, however all of them were for compressing, not extracting.

Solution

import re
import tarfile

extract = tarfile.open(file)
for f in extract:
    # rename each member in memory before extracting;
    # add other unsavory characters in the brackets
    f.name = re.sub(r'[:]', '_', f.name)
extract.extractall(path=new_path)
extract.close()

(The changes won't be saved to the original archive because tarfile.open opens it in read mode by default; only the in-memory member names are changed.)
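
If you also want the "skip the error and continue" behaviour from the question, one option is to extract members one at a time instead of calling extractall. The following is only a rough sketch, assuming Python 3 (where IOError is an alias of OSError). The names sanitize, extract_sanitized and ILLEGAL_CHARS are made up for this example, and the character set is my assumption of what to replace, based on the characters Windows reserves in file names.

```python
import os
import re
import tarfile

# Characters Windows does not allow in file names. "/" is left alone on
# purpose because it separates directories inside the archive.
ILLEGAL_CHARS = re.compile(r'[<>:"|?*]')


def sanitize(name):
    """Replace characters Windows rejects with underscores (hypothetical helper)."""
    return ILLEGAL_CHARS.sub('_', name)


def extract_sanitized(archive_path, dest):
    """Extract members one by one, renaming where needed.

    A member that still fails with an OSError is skipped so the rest of
    the archive is extracted. Returns the names of the skipped members.
    """
    skipped = []
    with tarfile.open(archive_path) as archive:
        for member in archive:
            # Only the in-memory name changes; the archive is opened read-only.
            member.name = sanitize(member.name)
            try:
                archive.extract(member, path=dest)
            except OSError:
                skipped.append(member.name)
    return skipped
```

Calling skipped = extract_sanitized(file, new_path) would then extract everything it can and leave you with a list of whatever had to be skipped, rather than stopping at the first bad name.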