ZODB / repoze.catalog - saving objects and catalog in same database?


I thought I understood this, but I'm beginning to wonder!

Consider the example from the repoze.catalog documentation:

from repoze.catalog.catalog import FileStorageCatalogFactory
from repoze.catalog.catalog import ConnectionManager

from repoze.catalog.indexes.field import CatalogFieldIndex
from repoze.catalog.indexes.text import CatalogTextIndex

factory = FileStorageCatalogFactory('catalog.db', 'mycatalog')

_initialized = False

def initialize_catalog():
    global _initialized
    if not _initialized:
        # create a catalog
        manager = ConnectionManager()
        catalog = factory(manager)
        # set up indexes
        catalog['flavors'] = CatalogFieldIndex('flavor')
        catalog['texts'] = CatalogTextIndex('text')
        # commit the indexes
        manager.commit()
        manager.close()
        _initialized = True

class Content(object):
    def __init__(self, flavor, text):
        self.flavor = flavor
        self.text = text

if __name__ == '__main__':
    initialize_catalog()
    manager = ConnectionManager()
    catalog = factory(manager)
    content = {
         1:Content('peach', 'i am so very very peachy'),
         2:Content('pistachio', 'i am nutty'),
         }
    for docid, doc in content.items():
        catalog.index_doc(docid, doc)
    manager.commit()

This shows how to generate the catalog entries for two instances of the Content class, but what is the correct mechanism for actually saving the objects?

I started out with a completely separate ZODB database in which I stored the objects, keyed on the docid used to catalog them under repoze.catalog. When it comes to transactions this is less than satisfactory, because when adding an object I have to issue a commit on both the catalog and the ZODB database used to store the objects.

I had assumed that I would be able to access the ZODB database that sits inside the repoze.catalog structures and use that to store the actual objects, but I'm having difficulty finding out how to do that.


Solution

  • A catalog like repoze.catalog is intended for indexing content, not storing it. Its purpose is to make finding your content (stored elsewhere) easy and fast by indexing certain aspects of that content.

    The example given is completely standalone and stores its data in a separate ZODB file. This supports the use case where the catalog indexes data that is not itself stored in the ZODB.

    However, you are free to store the catalog in the same ZODB you store your content in. Your content objects should follow the basic rules for persistent objects, but you are otherwise free to architect the storage structures.

    To create a repoze.catalog catalog for yourself, not using the provided FileStorageCatalogFactory, simply instantiate repoze.catalog.catalog.Catalog:

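    from repoze.catalog.catalog import Catalog

    # zodbroot is the root object of an open ZODB connection,
    # i.e. the same database that holds your content
    if 'mycatalog' not in zodbroot:
        zodbroot['mycatalog'] = Catalog()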