postgresql elixir phoenix-framework database-migration ecto

Ecto - The right way of migrating a field to a different type using custom logic?

I have a column :from, which was originally of type {:array, :string}. Now I want to migrate this column to be of type :string, taking the first entry of the array as the new value.

In Rails, you can do it with some custom logic in the migration. I'm trying to do the same with Ecto, but have run into problems due to schema validation and changeset errors.

defmodule Assistant.Repo.Migrations.ChangeFromFieldOnMails do
  use Ecto.Migration

  def up do
    dict_of_froms =
      Assistant.Mail
      |> Assistant.Repo.all()
      |> Enum.reduce(%{}, fn mail, acc ->
        Map.put(acc, mail.id, List.first(mail.from))
      end)

    alter table(:mails) do
      remove :from
      add :from, :string
    end

    Assistant.Mail
    |> Assistant.Repo.all()
    |> Enum.each(fn mail ->
      changeset = Ecto.Changeset.change(mail, from: Map.get(dict_of_froms, mail.id))
      Assistant.Repo.update!(changeset)
    end)
  end

  def down do
    dict_of_froms =
      Assistant.Mail
      |> Assistant.Repo.all()
      |> Enum.reduce(%{}, fn mail, acc ->
        Map.put(acc, mail.id, [mail.from])
      end)

    alter table(:mails) do
      remove :from
      add :from, {:array, :string}
    end

    Assistant.Mail
    |> Assistant.Repo.all()
    |> Enum.each(fn mail ->
      changeset = Ecto.Changeset.change(mail, from: Map.get(dict_of_froms, mail.id))
      Assistant.Repo.update!(changeset)
    end)
  end
end

The problem is that, I'll also have to change field :from, {:array, :string} to field :from, :string in my Mail schema, and this causes problems with the validation.

In the up step, Assistant.Repo.all() would fail because Ecto cannot load the from field from the old DB due to a type mismatch.

In the down step, Assistant.Repo.update!(changeset) would fail because Ecto.Changeset reported a type mismatch error on :from.

In Rails, there isn't really a strict check against the schema so you can get away with the code.

What is the right way of performing such migrations with Ecto? Is there no other way than writing custom SQL?

Solution

Based on the solution by apelsinka223 I was able to make it compile and work.

Some points worth noting:

I had to call flush() midway through the up and down functions, otherwise the removal and addition of columns wouldn't happen in time.
If a query is not based on a schema, one would need to explicitly use the select statement in the query for Ecto to run it.
update_all() needs at least two arguments. One can pass in [] as the second argument.

defmodule Assistant.Repo.Migrations.ChangeFromFieldOnMails do
  use Ecto.Migration
  import Ecto.Query, only: [from: 2]
  alias Assistant.Repo

  def up do
    query = from(m in "mails", select: {m.id, m.from})

    dict_of_froms =
      query
      |> Repo.all()
      |> Enum.reduce(%{}, fn {id, from}, acc ->
        Map.put(acc, id, List.first(from))
      end)

    alter table(:mails) do
      remove :from
      add :from, :string
    end

    flush()

    dict_of_froms
    |> Enum.each(fn {id, fr} ->
      query =
        from(m in "mails",
          where: m.id == ^id,
          update: [set: [from: ^fr]]
        )

      Repo.update_all(query, [])
    end)
  end

  def down do
    query = from(m in "mails", select: {m.id, m.from})

    dict_of_froms =
      query
      |> Repo.all()
      |> Enum.reduce(%{}, fn {id, from}, acc ->
        Map.put(acc, id, [from])
      end)

    alter table(:mails) do
      remove :from
      add :from, {:array, :string}
    end

    flush()

    dict_of_froms
    |> Enum.each(fn {id, fr} ->
      query =
        from(m in "mails",
          where: m.id == ^id,
          update: [set: [from: ^fr]]
        )

      Repo.update_all(query, [])
    end)
  end
end