bsddb.btopen alternative on Google Colab?


My notebook on Google Colab runs Python 3, and I plan to use some deep learning libraries (e.g. Keras, TF, Flair, OpenAI), so I really want to stay on Python 3 and not switch to Python 2.

However, I have a .db file that I want to open and read. The script that reads it is written in Python 2 because it uses the bsddb library, which is deprecated and doesn't work on Python 3:

self.term_to_id = bsddb.btopen(resource_prefix + '_term_to_id.db', 'r')

I tried modifying the Python 2 file to make it compatible with Python 3 so that I can import it as a module in my Google Colab notebook. What I tried:

  1. I tried changing bsddb to bsddb3. For that I first ran !pip install berkeleydb, so that I could then run !pip install bsddb3 and just update bsddb to bsddb3, but !pip install berkeleydb fails with the following errors:

ERROR: Could not find a version that satisfies the requirement berkeleydb (from versions: 18.1.0, 18.1.1, 18.1.2, 18.1.3, 18.1.4)
ERROR: No matching distribution found for berkeleydb

  2. I thought maybe I could just import the Python 2 file into my Python 3 notebook as a dependency, but as expected it didn't work, because 'import bsddb' in the Python 2 file is not recognized.

Any tips or workarounds to make this work on Google Colab?


Solution

  • berkeleydb is only a Python binding for the Berkeley DB database library, which is written in C/C++.

    When I try to install it on my local system (Linux Mint), I get the error

    FileNotFoundError: [Errno 2] No such file or directory: 'src/Modules/berkeleydb.h'
    

    which means that it tries to compile some C/C++ code.

    This usually requires installing a separate package with the C/C++ headers (.h files), whose name has the suffix -dev.

    Using

    !apt search Berkeley
    

    I found that libdb5.3 is already installed, so I installed the matching headers package, libdb5.3-dev

    !apt install libdb5.3-dev
    

    and after that pip can build and install berkeleydb.


    This works for me on Colab:

    !apt install libdb5.3-dev
    
    !pip install berkeleydb
    
    import berkeleydb as bsddb
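
    Once the import works, here is a minimal sketch of how the .db file from the question could be opened read-only and read with ordinary strings. It rests on assumptions not confirmed in the question: that the file is a B-tree database and that the stored keys and values are UTF-8 text. Under Python 3 the DB API works with bytes keys and values, so Python 2 code that passed str will probably need some encoding/decoding. The names open_btree_readonly and TextView are made up for illustration and are not part of berkeleydb.

```python
def open_btree_readonly(path):
    """Open an existing B-tree file read-only through berkeleydb's DB API."""
    # Imported lazily so this code still loads where berkeleydb is missing.
    from berkeleydb import db

    handle = db.DB()
    # Assumption: the file was created as a B-tree (it was opened with btopen).
    handle.open(path, dbtype=db.DB_BTREE, flags=db.DB_RDONLY)
    return handle


class TextView:
    """Hypothetical helper: str keys in, str values out, over a bytes mapping."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw            # any mapping with bytes keys and values
        self._encoding = encoding  # assumption: the data is UTF-8 text

    def __getitem__(self, key):
        # Raises KeyError for a missing key, like the underlying mapping.
        return self._raw[key.encode(self._encoding)].decode(self._encoding)

    def get(self, key, default=None):
        try:
            return self[key]
        except KeyError:
            return default
```

    In the script from the question this could be used as self.term_to_id = TextView(open_btree_readonly(resource_prefix + '_term_to_id.db')). The wrapper only needs a mapping with bytes keys and values, so it can be tried out with a plain dict before touching the real file.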