I want to decompress a huge gz file (the Wikidata JSON dump latest-all.json.gz, 104 GB compressed) on the fly in Python with gzip.open.
It works fine for a while, but after reading 39.7 million lines it fails with:
zlib.error: Error -3 while decompressing data: too many length or distance symbols
The function where I do the decompressing and reading looks like this:
import gzip
import json
...

def wikidata(filename):
    with gzip.open(filename, mode='rt') as f:
        f.read(2)  # skip the first two characters: "{\n"
        for line in f:
            try:
                yield json.loads(line.rstrip(',\n'))
            except json.decoder.JSONDecodeError:
                continue
The error in full is:
Traceback (most recent call last):
  File "parse.py", line 95, in <module>
    for line in lines:
  File "parse.py", line 21, in wikidata
    for line in f:
  File "/usr/lib/python3.8/gzip.py", line 305, in read1
    return self._buffer.read1(size)
  File "/usr/lib/python3.8/_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "/usr/lib/python3.8/gzip.py", line 487, in read
    uncompress = self._decompressor.decompress(buf, size)
zlib.error: Error -3 while decompressing data: too many length or distance symbols
What can be the reason for this? How can I solve the problem?
Error -3 from zlib means the deflate stream itself is invalid: the compressed data is corrupted at that point, or a short distance before it. There is nothing to recover there; the only way to solve the problem is to replace the input with a gzip file that is not corrupted, e.g. by re-downloading the dump and, if a checksum is published for it, verifying the download against that checksum.