python jupyter-notebook pip huggingface-transformers python-importlib

How to install a different version of a package (transformers) without internet in a Kaggle notebook, without killing the kernel and while keeping variables in memory?


I have prepared an inference pipeline for a Kaggle competition and it has to be executed without internet connection.

I'm trying to use different versions of transformers in the same session, but I ran into problems with the installation.

Kaggle's default transformers version is 4.26.1. I start with installing a different branch of transformers (4.18.0.dev0) like this.

!pip install ./packages/sacremoses-0.0.53
!pip install /directory/to/packages/transformers-4.18.0.dev0-py3-none-any.whl --find-links /directory/to/packages

It installs transformers-4.18.0.dev0 without any problem. I use this version of the package and do the inference with some models. Then I want to use another package open_clip_torch-2.16.0 which is compatible with transformers-4.27.3, so I install them by simply doing

!pip install /directory/to/packages/transformers-4.27.3-py3-none-any.whl --no-index --find-links /directory/to/packages
!pip install /directory/to/packages/open_clip_torch-2.16.0-py3-none-any.whl --no-index --find-links /directory/to/packages/

pip reports Successfully installed transformers-4.27.3 and open_clip_torch-2.16.0.

!pip list | grep transformers outputs transformers 4.27.3 but when I do

import transformers
transformers.__version__

the version is '4.18.0.dev0'. Because of that I can't use open_clip: some of the code breaks because it still uses the old version of transformers, even though I installed a newer one. How can I resolve this issue?


Solution

  • When you first import a module in a Python session, it is cached in sys.modules. Subsequent imports are served from that cache rather than read from disk, which is why you are not seeing the newly installed version. The following still reports the old version:

    import sys
    import transformers
    sys.modules['transformers'].__version__
    

    A possible solution is to attempt to reload the module using importlib.reload.

    import importlib
    importlib.reload(transformers)
    sys.modules['transformers'].__version__
    

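    Read the importlib.reload documentation so that you are aware of the caveats of this method. In particular, reload re-executes only the module you pass to it: submodules that were already imported stay cached in sys.modules, and objects created before the reload (model instances, or names that other libraries imported with from transformers import ...) keep referring to the old code.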
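
    Because transformers is a large package with many submodules, a single importlib.reload call may leave a mix of old and new code in memory. One alternative is to drop the package and all of its submodules from sys.modules and then import it again. The following is a minimal sketch of that idea, not an official API: purge_and_reimport is a hypothetical helper name, and it assumes nothing you still need holds references to objects from the old version. It has not been verified against Kaggle's environment or these specific transformers versions.

```python
import importlib
import sys


def purge_and_reimport(package_name):
    """Drop a package and all its submodules from sys.modules, then import it again.

    Hypothetical helper for illustration; not part of importlib or transformers.
    """
    prefix = package_name + "."
    # Collect the names first: deleting entries while iterating over
    # sys.modules would raise a RuntimeError.
    stale = [
        name for name in sys.modules
        if name == package_name or name.startswith(prefix)
    ]
    for name in stale:
        del sys.modules[name]
    # Let the import system notice files written after the kernel started.
    importlib.invalidate_caches()
    # With the cache entries gone, this import is read from disk again.
    return importlib.import_module(package_name)


# Usage in the notebook, after the second pip install (left commented out so
# the sketch does not require transformers to be installed):
# transformers = purge_and_reimport("transformers")
# transformers.__version__
```

    Models, tokenizers and other objects created with the old version remain instances of the old classes, and libraries imported earlier that depend on transformers keep their old references, so import open_clip only after the purge. If the old and new versions need to coexist more cleanly, running each stage in a separate process is the safer route, at the cost of not sharing in-memory variables.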