I have two data types.
data Foo = Foo
  { fooId :: RecordId Foo
  , bars :: [RecordId Bar]
  ...
  }
data Bar = Bar
  { barId :: RecordId Bar
  ...
  }
This schema allows for each Foo to refer to an arbitrary list of Bars. Clearly, Bars can be shared among any number of Foos, or no Foos.
I already have data persisted in acid-state that uses this schema. I want to migrate it to the following schema:
data Foo = Foo
  { fooId :: RecordId Foo
  ...
  }
data Bar = Bar
  { barId :: RecordId Bar
  , fooId :: RecordId Foo
  ...
  }
In the desired state, each Bar must have exactly one Foo, as in common many-to-one SQL foreign key relationships.
Now of course, there is no way to perfectly transition between these two states, as the latter is less expressive than the former. However, I can write code that deals with any ambiguity here (for duplicate references, prefer the Foo with the smallest fooId, and simply delete any Bars that are not referenced by a Foo).
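The disambiguation rule just described can be written as a pure function. The following is only a minimal sketch: `FooId` and `BarId` are hypothetical `Int` stand-ins for `RecordId Foo` and `RecordId Bar` (which I am assuming can be ordered), and `resolveOwners` and `assignOwners` are made-up helper names, not part of acid-state or SafeCopy.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-ins for RecordId Foo / RecordId Bar.
type FooId = Int
type BarId = Int

-- | For every Bar referenced by at least one Foo, pick the referencing
-- Foo with the smallest id. Bars referenced by no Foo are absent.
resolveOwners :: [(FooId, [BarId])] -> Map.Map BarId FooId
resolveOwners foos =
  Map.fromListWith min [ (b, f) | (f, bs) <- foos, b <- bs ]

-- | Keep only the Bars that have an owner, paired with that owner.
-- Unreferenced Bars are dropped.
assignOwners :: Map.Map BarId FooId -> [BarId] -> [(BarId, FooId)]
assignOwners owners bars =
  [ (b, f) | b <- bars, Just f <- [Map.lookup b owners] ]
```

For example, `assignOwners (resolveOwners [(2, [10, 11]), (1, [11, 12])]) [10, 11, 12, 13]` evaluates to `[(10,2),(11,1),(12,1)]`: Bar 11 is claimed by both Foos and goes to Foo 1, and Bar 13 is dropped because nothing references it.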
My problem is that I cannot see any path, using SafeCopy, to migrate between these two schemas. As far as I can tell, SafeCopy defines migrations as pure functions between types, and I cannot query the state of acid-state inside a migrate function. What I need here, though, is a migration that runs once, on the state at a specific point in time, and converts one schema into the other. With a database this would be trivial, but with acid-state I just can't see my way forward.
The only inkling of a solution that I have is a separate program (or, say, a command-line feature callable from the main program) compiled specifically to run the few lines of code necessary to handle the data migration (so that, say, all Foov0, Barv0 are converted to Foov1, Barv1), after which I simply swap the new schema into my main program.
However, I don't even see how this could work. As I understand SafeCopy, if I define migrations to the new schema in the normal way, then as soon as I try to access the data I will be given a value of the new data type, which of course no longer contains the information I need to actually perform the migration.
One option (clumsy, it seems to me) might be to define two further data types, copy the data across to them, then change the schema and run a migration that copies the data back into the new schema, and finally remove the extra data types. That requires three separately compiled versions of the program to run on the data in sequence, which does not seem very elegant!
Any pointers would be greatly appreciated.
I neglected to mention that the schema above is wrapped in a data type that represents the entire state of the program, like
data DB = DB {
  dbFoos :: [Foo],
  dbBars :: [Bar]
}
I think this means that all I need to do is define a new DB data type and write a migration from DBv0 to DB, handling my data there without any need for sequencing or monadic activity. I will experiment with this and post it as an answer if successful.
In my particular circumstance, because the state was wrapped by a single DB type, the solution was to write a migration for the top-level type. The Migrate instance therefore had access to all of the data, so it could run the necessary logic to complete the migration. The solution looks something like this:
data DB = DB {
  dbFoos :: [Foo],
  dbBars :: [Bar]
}
data DB_v0 = DB_v0 {
  v0_dbFoos :: [Foo_v0],
  v0_dbBars :: [Bar_v0]
}
data Foo = Foo
  { fooId :: RecordId Foo
  ...
  }
data Bar = Bar
  { barId :: RecordId Bar
  , fooId :: RecordId Foo
  ...
  }
data Foo_v0 = Foo_v0
  { v0_fooId :: RecordId Foo
  , v0_bars :: [RecordId Bar]
  ...
  }
data Bar_v0 = Bar_v0
  { v0_barId :: RecordId Bar
  ...
  }
instance Migrate DB where
  type MigrateFrom DB = DB_v0
  migrate dbV0 = DB
    { dbFoos = migrateOldFoos
    , dbBars = migrateOldBars
    }
    where
      migrateOldFoos :: [Foo]
      -- (access to all old data possible here)
      migrateOldBars :: [Bar]
      -- (access to all old data possible here)
Foo and Bar also need the relevant Migrate instances (Foo_v0 to Foo and Bar_v0 to Bar). One potential gotcha is that the definition of DB_v0 has to reference Foo_v0 and Bar_v0; otherwise SafeCopy would automatically migrate them to Foo and Bar while deserializing, which would mean that the old data was already gone before you were able to use it in the Migrate DB instance.
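To make the wiring concrete, here is a rough, self-contained sketch of how the version numbers and kinds could be declared with `deriveSafeCopy`. Several things in it are assumptions rather than my real code: `Int` stands in for `RecordId`, the elided fields are left out, `barFooId` is a hypothetical field name (chosen to avoid clashing with `fooId`), the old types are assumed to have been written at version 0, and for brevity the new `Foo` and `Bar` are declared as fresh `base` types instead of extensions of `Foo_v0` and `Bar_v0`.

```haskell
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}

import Data.SafeCopy (Migrate (..), base, deriveSafeCopy, extension)

-- Old schema: DB_v0 refers to the old record types, so they are
-- deserialized unchanged and are still available inside 'migrate'.
data Foo_v0 = Foo_v0 { v0_fooId :: Int, v0_bars :: [Int] }
data Bar_v0 = Bar_v0 { v0_barId :: Int }
data DB_v0  = DB_v0  { v0_dbFoos :: [Foo_v0], v0_dbBars :: [Bar_v0] }

-- New schema (barFooId is a hypothetical field name).
data Foo = Foo { fooId :: Int }
data Bar = Bar { barId :: Int, barFooId :: Int }
data DB  = DB  { dbFoos :: [Foo], dbBars :: [Bar] }

-- Old types keep the version they were persisted with (assumed 0).
$(deriveSafeCopy 0 'base ''Foo_v0)
$(deriveSafeCopy 0 'base ''Bar_v0)
$(deriveSafeCopy 0 'base ''DB_v0)

-- Simplification: new record types as fresh base types at version 1.
$(deriveSafeCopy 1 'base ''Foo)
$(deriveSafeCopy 1 'base ''Bar)

instance Migrate DB where
  type MigrateFrom DB = DB_v0
  migrate old = DB
    { dbFoos = [ Foo (v0_fooId f) | f <- v0_dbFoos old ]
    , dbBars = [ Bar (v0_barId b) owner
               | b <- v0_dbBars old
               , Just owner <- [ownerOf (v0_barId b)] ]  -- drops orphans
    }
    where
      -- Smallest id among the old Foos that reference this Bar, if any.
      ownerOf bid =
        case [ v0_fooId f | f <- v0_dbFoos old, bid `elem` v0_bars f ] of
          []  -> Nothing
          ids -> Just (minimum ids)

-- The Migrate instance must appear before this splice.
$(deriveSafeCopy 1 'extension ''DB)
```

The important part is the last line: `DB` is an `extension` whose `MigrateFrom` is `DB_v0`, so when an old checkpoint is read, SafeCopy decodes a complete `DB_v0` first and only then calls `migrate`.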
SafeCopy = awesome