Search code examples
postgresqlelixirectoclickhouse

How to join multiple repositories with Ecto?


I have two databases, the main databases (PostgreSQL) + the statistics database (ClickHouse). Statistics database contains a subpart of data from the main database which is enough for performing calculations. All ids are similar (:binary_id) across both databases. I need to find a way of joining the results obtained from the statistic database with a query to the main database. In terms of pure SQL solution it could be something like this, where VALUES are data obtained from statistics database:

SELECT p0."id",
       p0."name",
       f1."average_count"
FROM "persons" AS p0
         JOIN (VALUES (0.0, '906af2c0-cde2-4996-9a98-bdbf986fe687'::uuid),
                      (0.2857142857142857, 'aba7c694-3453-4a55-aab9-4b542dbb4ba9'::uuid),
                      (0.2857142857142857, '2dab3350-6149-4752-a55e-7477a6ad0dd3'::uuid))
               as f1 (average_count, user_id)
              on f1.user_id = p0.id;

My project actively uses Ecto and has a lot of on-the-fly constructed queries. That's why I cannot just perform pure SQL queries as I post above and should have Ecto based solution. Is there a way to do such a joining with Ecto?


Solution

  • It's not pretty, but you could take advantage of Postgres' UNNEST:

    users = [
      %{id: "906af2c0-cde2-4996-9a98-bdbf986fe687", average_count: 0.0},
      %{id: "aba7c694-3453-4a55-aab9-4b542dbb4ba9", average_count: 0.2857142857142857},
      %{id: "2dab3350-6149-4752-a55e-7477a6ad0dd3", average_count: 0.2857142857142857}
    ]
    
    {ids, average_counts} =
      users
      |> Stream.map(&{&1.id, &1.average_count})
      |> Enum.unzip()
    
    dumped_ids =
      for id <- ids do
        {:ok, dumped} = Ecto.UUID.dump(id)
        dumped
      end
    
    query =
      from p in Person,
        join: f in fragment("SELECT UNNEST(?::uuid[]) AS user_id, UNNEST(?::float[]) AS average_count", ^dumped_ids, ^average_counts),
        on: f.user_id == p.id,
        select: %{id: p.id, name: p.name, average_count: f.average_count}
    
    Repo.all(query)
    

    Maybe it's not the best way of doing it. I'm no DB expert. But that works for me in IEx.