Search code examples
springpostgresqlspring-bootkotlindatabase-migration

SpringBoot (Kotlin) - running a data migration in one time as a batch job


So, I have a specific problem and I can't find the Spring best practice for it.

I have a Spring instance in front of a Postgres DB. I have a domain entity that I'm adding some fields to (using flyway).

Essentially, imagine I have the following object:

Book
last_text_update // 2018-11-25 07:00:00
last_writer      // PUBLISHER
is_finished      // true
contract_closed_timestamp // NULL

From this, we do weird calculations like

"If last_writer == PUBLISHER && is_finished == true, return IS_SELLING_IN_STORES"

As you can imagine, this is ugly, and I'm refactoring it to the following:

Book
processedStatus: PUBLISHED/EDITING/PROPOSAL
workStatus: AWAITING_EDITOR_FEEDBACK/AWAITING_CHANGES/FINISHED
etc.etc.

So, I've prepared the SQL migrations that have added the necessary status columns. The way I see it, my options are to either:

1) Figure out all of the column transitions in raw SQL and run that migration on the production server like any other migration. i.e. where timestamp = blah and other timestamp = blah, set status as follows

2) Run a one-off batch job that cycles through each object and updates it.

It seems like it would be easier to do in Kotlin code via batch job vs. doing it in SQL, but what is the consensus on this? Is there a recommended to do a one-off batch job (i.e. ssh into an instance and manually run it, start it with HTTP over the API, etc.), or do I just suck it up and use SQL?


Solution

  • Migration using SQL (Option 1) seems more reasonable, given that

    1. you are already using Flyway and
    2. It seems ugly to me to pollute the entity model to allow it to do the data migration work. e.g.
      • you will need the entity model to behave given an "incomplete" persistence form, and
      • you need to keep the obsolete fields, and etc.

    Don't forget the performance difference if you are cycling through lots of entities too. Unless there are some special requirement that need you to do the modification using entity model (e.g. sending out events etc), it seems more reasonable to do the data migration using SQL.