python database dataset zodb object-oriented-database

ZODB or another database for large data storage in Python


I am using ZODB to store a large amount of data held in a typical dictionary (key, value) structure. While storing it in ZODB, I got the following warning message:

C:\python-3.5.2.amd64\lib\site-packages\ZODB\Connection.py:550: UserWarning: The object you're saving is large. (510241658 bytes.)

Perhaps you're storing media which should be stored in blobs.

Perhaps you're using a non-scalable data structure, such as a PersistentMapping or PersistentList.

Perhaps you're storing data in objects that aren't persistent at all. In cases like that, the data is stored in the record of the containing persistent object.

In any case, storing records this big is probably a bad idea.

If you insist and want to get rid of this warning, use the large_record_size option of the ZODB.DB constructor (or the large-record-size option in a configuration file) to specify a larger size.

warnings.warn(large_object_message % (obj.__class__, len(p)))

Please suggest how I can store large data in ZODB, or suggest another library for this purpose.


Solution

  • Use ZODB's native BLOB support to store large data. Anything else is an anti-pattern unless you have an application-specific need for some kind of cloud storage that cannot be supported on a local filesystem.

    You have not said what you are storing or what your storage configuration looks like, but I think the right approach is the same either way: use BLOBs.

    How this works: the Blob API stores the data under the OID (internal ZODB object id) of the wrapping persistent object, which is usually referenced as an attribute of your primary persisted object. That OID is the key used to find the BLOB data, fetch it, and so on, from your configured BLOB storage.

    Usually the BLOB data is simply a file on the filesystem of your application, but it may also be stored on the filesystem of your database server (ZEO, or an RDBMS behind RelStorage, depending on configuration). It is possible that some databases (e.g. a PostgreSQL backend to RelStorage) can store BLOBs using their native BLOB storage mechanisms, which ZODB (via RelStorage) offloads to.

    References:

    1. https://ziade.org/2007/09/14/to-blob-or-not-to-blob/

    2. https://stackoverflow.com/a/14645205/835961

    3. Useful libraries:

      a. z3c.blobfile (ZPL-licensed)

      b. plone.namedfile (BSD-licensed)