
keras.utils.get_file() throws TypeError: '<' not supported between instances of 'int' and 'NoneType'


I am following along with the book Applied Deep Learning and Computer Vision for Self-Driving Cars and am running into issues with Keras while running some of the example code. When I try to download a file with the get_file() function, I get a TypeError.

System: Windows 10 | Python 3.9.19 | TensorFlow 2.10.1

Code snippet: dataset_path = keras.utils.get_file("auto-mpg.data", "https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")

Desired behavior: the file is downloaded.

Resulting error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[2], line 2
      1 # Datapath to import auto-mpg data
----> 2 dataset_path = keras.utils.get_file("auto-mpg.data", "https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data")

File ~\anaconda3\envs\AVTech\lib\site-packages\keras\utils\data_utils.py:296, in get_file(fname, origin, untar, md5_hash, file_hash, cache_subdir, hash_algorithm, extract, archive_format, cache_dir)
    294 try:
    295     try:
--> 296         urlretrieve(origin, fpath, DLProgbar())
    297     except urllib.error.HTTPError as e:
    298         raise Exception(error_msg.format(origin, e.code, e.msg))

File ~\anaconda3\envs\AVTech\lib\site-packages\keras\utils\data_utils.py:86, in urlretrieve(url, filename, reporthook, data)
     84 response = urlopen(url, data)
     85 with open(filename, "wb") as fd:
---> 86     for chunk in chunk_read(response, reporthook=reporthook):
     87         fd.write(chunk)

File ~\anaconda3\envs\AVTech\lib\site-packages\keras\utils\data_utils.py:78, in urlretrieve.<locals>.chunk_read(response, chunk_size, reporthook)
     76 count += 1
     77 if reporthook is not None:
---> 78     reporthook(count, chunk_size, total_size)
     79 if chunk:
     80     yield chunk

File ~\anaconda3\envs\AVTech\lib\site-packages\keras\utils\data_utils.py:287, in get_file.<locals>.DLProgbar.__call__(self, block_num, block_size, total_size)
    285     self.progbar = Progbar(total_size)
    286 current = block_num * block_size
--> 287 if current < total_size:
    288     self.progbar.update(current)
    289 elif not self.finished:

TypeError: '<' not supported between instances of 'int' and 'NoneType'

The only other post I could find about this error had a single answer: "It works for me".


Solution

  • I can reproduce the error on Colab with TF 2.10. It appears to have been fixed in the next version, which added the following lines:

    if total_size is None:
        self.progbar.update(current) 
    

    I'm not exactly sure why total_size cannot be determined here, but this appears to have been a recognised bug.
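The effect of that guard can be illustrated with a standalone sketch. The class below is hypothetical and is not Keras's `DLProgbar`; it only mirrors the `(block_num, block_size, total_size)` call signature visible in the traceback. It also assumes that `total_size` arrives as `None` whenever the download size is unknown, for example if the server sends no `Content-Length` header.

```python
class ProgressReporter:
    """Hypothetical stand-in for a download progress hook (not Keras code)."""

    def __init__(self):
        self.finished = False
        self.last_reported = 0

    def __call__(self, block_num, block_size, total_size):
        current = block_num * block_size
        if total_size is None:
            # Size unknown: record the bytes seen so far and skip the
            # `current < total_size` comparison that raises TypeError.
            self.last_reported = current
        elif current < total_size:
            self.last_reported = current
        elif not self.finished:
            # Clamp to the total once the final block has arrived.
            self.last_reported = total_size
            self.finished = True


reporter = ProgressReporter()
reporter(1, 8192, None)   # unknown size: no exception
reporter(1, 8192, 20000)  # known size: normal path
```

Without the `is None` branch, the first call would fail in the same way as line 287 of the traceback, because Python 3 does not define `<` between `int` and `None`.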

    As TF 2.10 was the last version with GPU support on native Windows, I'd say you have 2 options:

    • use another function to download your data (for example, the "import in python" snippet on the dataset's page, although it needs an extra dependency). Pandas could also be an option, although this file is whitespace-delimited rather than a true CSV file, so it would need a custom separator.
    • update to a newer TensorFlow version using WSL2 (link to docs) on Windows
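For the first option, a minimal sketch using only the standard library might look like the following. The helper name `download_file` and its parameters are hypothetical. The default cache location is an assumption chosen to resemble where `get_file()` usually stores datasets. Since there is no progress bar, an unknown download size is never compared against anything.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url, fname, cache_dir=None):
    """Download `url` to `cache_dir/fname` and return the local path.

    Hypothetical helper, not part of Keras. It reports no progress,
    so a missing download size cannot cause the TypeError.
    """
    # Assumed default: a folder similar to the one get_file() uses.
    cache_dir = Path(cache_dir) if cache_dir else Path.home() / ".keras" / "datasets"
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / fname

    if not target.exists():
        # Write to a temporary name first so that an interrupted
        # download does not leave a truncated file under `fname`.
        tmp = target.with_name(target.name + ".part")
        with urllib.request.urlopen(url) as response, open(tmp, "wb") as fd:
            shutil.copyfileobj(response, fd)
        tmp.replace(target)

    return str(target)
```

With this in place, the line from the question would become `dataset_path = download_file("https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", "auto-mpg.data")`. Note that the argument order (URL first) differs from `get_file()`.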