jdbc db4o database-schema object-oriented-database

Handling data maintenance in Object Databases like db4o

One thing I have continually found very confusing about using an object database like db4o is how you are supposed to handle complex migrations that would normally be handled by SQL or PL/SQL.

For example, imagine you had a table in a relational database called my_users. Originally it had a column named "full_name". Now that your software is in V2, you wish to remove this column, split the full names on a blank space, and put the first part in a column named "first_name" and the second in a column named "last_name". In SQL I would simply populate the "first_name" and "last_name" columns and then remove the original "full_name" column.

How would I do this in something like db4o? Do I write a Java program that looks up all objects of User.class, sets full_name to null, and sets first_name and last_name? After my next svn commit there will be no field or bean property corresponding to full_name; would this be a problem? It seems that to use db4o in a production application where my "schema" changes, I would need to write a script that migrates data from version x to version x+1, and only in version x+2 actually remove the properties I am trying to get rid of, because I cannot write Java code that modifies properties that are no longer part of my type.

Part of the problem seems to be that an RDBMS resolves what you are referring to by a simple, case-insensitive, string-based name. In a language like Java, typing is more complicated: you cannot refer to a property if its getter, setter, or field is not a member of the class loaded at runtime. So you essentially need to either have two versions of your code in the same script (hmm, custom classloaders sound like a pain), have the new version of your stored class belong to another package (sounds messy), or use the version x+1/x+2 strategy I mentioned (requires a lot more planning). Perhaps there is some obvious solution I never gleaned from the db4o documents.

Any ideas? Hopefully this makes some sense.

Solution

First, db4o handles the 'simple' scenarios like adding or removing a field automatically. When you add a field, all existing objects get the default value for it. When you remove a field, the data of existing objects is still in the database and you can still access it. Renaming a field and similar changes are done with special 'refactoring' calls.

Now, for your scenario you would do something like this:

Remove the field 'full_name' and add the new fields 'first_name' and 'last_name'
Iterate over all 'Address'-objects
Access the old field via the 'StoredClass'-API
Split, change, or otherwise update the value, set the results on the new fields, and store the object

Let's assume we have an 'Address' class. The 'full_name' field has been removed. Now we want to copy its value to 'firstname' and 'surname'. Then it could go like this (Java):

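    // Assumes 'db' is an open ObjectContainer (needs com.db4o.ObjectSet and com.db4o.ext.StoredField).
    ObjectSet<Address> addresses = db.query(Address.class);
    // The removed field is still stored; look it up by its old name and type.
    StoredField metaInfoOfField = db.ext().storedClass(Address.class).storedField("full_name", String.class);
    for (Address address : addresses) {
        String fullName = (String) metaInfoOfField.get(address);
        if (fullName == null) {
            continue; // no old value stored for this object
        }
        // Split on the first space only, so a single-word name does not throw.
        String[] splitName = fullName.split(" ", 2);
        address.setFirstname(splitName[0]);
        address.setSurname(splitName.length > 1 ? splitName[1] : "");
        db.store(address);
    }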
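
The split step in the loop above is the part most likely to trip over real data (single-word names, extra spaces, missing values). As a minimal sketch, the splitting could be pulled into a small helper that is easy to test on its own. `NameSplitter` is a hypothetical name for illustration and is not part of db4o; putting everything after the first word into the last name is an assumption about the desired behaviour.

```java
// Hypothetical helper, not part of db4o: splits a stored full name into two parts.
final class NameSplitter {
    private NameSplitter() {}

    /** Returns {firstName, lastName}; lastName is "" when there is only one word. */
    static String[] split(String fullName) {
        if (fullName == null || fullName.trim().isEmpty()) {
            return new String[] {"", ""};
        }
        // Limit of 2: everything after the first run of whitespace goes to the last name.
        String[] parts = fullName.trim().split("\\s+", 2);
        String last = parts.length > 1 ? parts[1] : "";
        return new String[] {parts[0], last};
    }

    public static void main(String[] args) {
        String[] name = split("Ada King Lovelace");
        System.out.println(name[0] + " | " + name[1]); // prints "Ada | King Lovelace"
    }
}
```

Inside the migration loop you would then pass the value read through the 'StoredField' to `NameSplitter.split(...)` and set the two results on the object.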

As you suggested, you would write migration code for each version bump. If a field isn't part of your class anymore, you have to access it with the 'StoredField' API like above.
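
One way to organise "migration code for each version bump" is a small runner that applies the pending steps in order. The following is only a sketch in plain Java: `MigrationRunner`, `register` and `migrate` are hypothetical names, not db4o API, and where the current schema version is kept (for example in a small object stored in the database) is left as an assumption. Each step body would contain code like the 'StoredField' loop above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, not a db4o API: runs numbered migration steps in order.
final class MigrationRunner {
    // Key = the schema version a step upgrades TO; TreeMap keeps the steps sorted.
    private final TreeMap<Integer, Runnable> steps = new TreeMap<>();

    void register(int targetVersion, Runnable step) {
        steps.put(targetVersion, step);
    }

    /** Runs every step above currentVersion and returns the version reached. */
    int migrate(int currentVersion) {
        int version = currentVersion;
        // tailMap(x, false) = all steps whose target version is strictly greater than x.
        for (Map.Entry<Integer, Runnable> e : steps.tailMap(currentVersion, false).entrySet()) {
            e.getValue().run();
            version = e.getKey();
        }
        return version;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        MigrationRunner runner = new MigrationRunner();
        // Registration order does not matter; steps run sorted by target version.
        runner.register(3, () -> log.add("v2 -> v3"));
        runner.register(2, () -> log.add("v1 -> v2: split full_name"));
        int reached = runner.migrate(1);
        System.out.println(reached + " " + log);
    }
}
```

A database that is already at version 2 would only run the v3 step, and one that is already at version 3 would run nothing.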

You can get a list of all 'stored' classes with ObjectContainer.ext().storedClasses(). With StoredClass.getStoredFields() you can get a list of all stored fields, no matter whether the field still exists in your class. If a class doesn't exist anymore, you can still get its objects and access them via the 'GenericObject' class.

Update: For more complex scenarios, where a database needs to be migrated over multiple version steps.

For example, in version v3 the address object looks completely different, so the 'migration script' for v1 to v2 no longer has the fields it requires (firstname and surname in my example). I think there are multiple possibilities for handling this.

(Assuming Java for this idea; certainly there's an equivalent in .NET.) You could make each migration step a Groovy script, so that the scripts do not interfere with one another. In each script you define the classes needed for that migration, so each migration has its own migration classes. With aliases you would bind your Groovy migration classes to the actual Java classes.
Create refactoring classes for complex scenarios, and bind these classes with aliases as well.