
Trouble with select/3 using having/3 in Ecto queries


(Edit: added database representation and updated trials)

In our database we have a Member and a Membership schema. A Member has many Memberships, and each Membership has start_date and end_date fields. I am trying to query the Members that have more than one Membership and select the start_date and end_date of those Memberships. Is there a way to do this in a single query, without using the preload/3 function?

Our database can be represented by tuples:

# {Membership.member_id, Membership.start_date, Membership.end_date}

[
  {1, ~D[2019-03-12], ~D[2020-03-11]},
  {1, ~D[2019-04-05], ~D[2020-04-04]},
  {3, ~D[2019-04-25], ~D[2020-04-24]},
  {3, ~D[2020-06-12], ~D[2021-06-12]}
]

I have tried doing

Repo.all from m in Member,
      left_join: s in assoc(m, :memberships),
      group_by: [s.start_date, s.end_date],
      having: count(s) > 1,
      select: {s.start_date, s.end_date}

# Output: [{~D[2019-04-25], ~D[2020-04-24]}]

but all it gave me was the 3rd element from the database.

These are the two queries that I am currently using:

member_ids =
      Repo.all from m in Member,
      left_join: s in assoc(m, :memberships),
      group_by: s.member_id,
      having: count(s) > 1,
      select: s.member_id

# Output: [1, 3]

data =
      Repo.all from m in Member,
      left_join: s in assoc(m, :memberships),
      where: m.id in ^member_ids,
      select: {s.start_date, s.end_date}

# Output:
# [
#   {~D[2019-04-05], ~D[2020-04-04]},
#   {~D[2019-03-12], ~D[2020-03-11]},
#   {~D[2019-04-25], ~D[2020-04-24]},
#   {~D[2020-06-12], ~D[2021-06-12]}
# ]

The expected result would be a list of tuples, e.g.:

[
  {~D[2019-03-12], ~D[2020-03-11]},
  {~D[2019-04-05], ~D[2020-04-04]},
  {~D[2019-04-25], ~D[2020-04-24]},
  {~D[2020-06-12], ~D[2021-06-12]}
]

Solution

  • You can combine the array_agg and unnest functions to achieve the desired result.

    From the information provided, it looks like you do not need to join the members table at all: querying memberships alone should be enough.

    A plain SQL query that produces a result closely resembling the one you expect would be:

    # select unnest(array_agg((start_date, end_date))) from memberships group by member_id having count(1) > 1;
             unnest
    -------------------------
     (2019-03-12,2020-03-11)
     (2019-04-05,2020-04-04)
     (2019-04-25,2020-04-24)
     (2020-06-12,2021-06-12)
    (4 rows)
    

    As you can see, every row here is of type record. However, if we translate the query to Ecto, we get exactly the list of tuples you outlined:

    iex(1)> import Ecto.Query
    Ecto.Query
    iex(2)> query =
    ...(2)>   from m in "memberships",
    ...(2)>     having: count(1) > 1,
    ...(2)>     group_by: m.member_id,
    ...(2)>     select: fragment("unnest(array_agg((?, ?)))", m.start_date, m.end_date)
    #Ecto.Query<from m0 in "memberships", group_by: [m0.member_id],
     having: count(1) > 1,
     select: fragment("unnest(array_agg((?, ?)))", m0.start_date, m0.end_date)>
    iex(3)> Repo.all(query)
    11:21:51.490 [debug] QUERY OK source="memberships" db=3.4ms
    SELECT unnest(array_agg((m0."start_date", m0."end_date"))) FROM "memberships" AS m0 GROUP BY m0."member_id" HAVING (count(1) > 1) []
    [
      {~D[2019-03-12], ~D[2020-03-11]},
      {~D[2019-04-05], ~D[2020-04-04]},
      {~D[2019-04-25], ~D[2020-04-24]},
      {~D[2020-06-12], ~D[2021-06-12]}
    ]
    iex(4)>
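
    To make the grouping logic concrete, here is a minimal in-memory sketch of what the query above computes, written with plain Enum functions instead of SQL. It uses the sample tuples from the question plus one hypothetical row for member 2 (not in the original data) to show a member being filtered out. It is only an illustration of the semantics, not a replacement for the database query.

```elixir
# Sample rows from the question: {member_id, start_date, end_date}.
# The row for member 2 is a hypothetical addition, so that the
# "more than one membership" filter has something to exclude.
memberships = [
  {1, ~D[2019-03-12], ~D[2020-03-11]},
  {1, ~D[2019-04-05], ~D[2020-04-04]},
  {2, ~D[2019-01-01], ~D[2019-12-31]},
  {3, ~D[2019-04-25], ~D[2020-04-24]},
  {3, ~D[2020-06-12], ~D[2021-06-12]}
]

result =
  memberships
  # GROUP BY member_id, collecting each group's date pairs (the array_agg step)
  |> Enum.group_by(
    fn {member_id, _start_date, _end_date} -> member_id end,
    fn {_member_id, start_date, end_date} -> {start_date, end_date} end
  )
  # HAVING count(1) > 1
  |> Enum.filter(fn {_member_id, ranges} -> length(ranges) > 1 end)
  # unnest: flatten the per-member lists back into one list of tuples
  |> Enum.flat_map(fn {_member_id, ranges} -> ranges end)

IO.inspect(result)
```

    The result contains the four date pairs belonging to members 1 and 3. As with the SQL version, no ordering across members is requested, so the order should not be relied on.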
    

    Should you need to join the members table (for example, to filter on member attributes), the same approach still works. For example:

      select unnest(array_agg((start_date, end_date)))
        from memberships
        join members on members.id = memberships.member_id
       where members.active
    group by member_id
      having count(1) > 1;
    

    The equivalent query expressed in Ecto would look like this:

    from m in "memberships",
      join: member in "members", on: member.id == m.member_id,
      having: count(1) > 1,
      where: member.active,
      group_by: m.member_id,
      select: fragment("unnest(array_agg((?, ?)))", m.start_date, m.end_date)
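
    If you would rather avoid the fragment, a different technique that still issues a single query is to filter with a subquery. The sketch below is an assumption on my part rather than something tested against your schema: it relies on an Ecto version that accepts `in subquery(...)` inside `where`, and it reuses the schemaless "memberships" source from above. The names multi_member_ids and query are illustrative only.

```elixir
import Ecto.Query

# Inner query: ids of members that have more than one membership.
multi_member_ids =
  from m in "memberships",
    group_by: m.member_id,
    having: count(m.member_id) > 1,
    select: m.member_id

# Outer query: one row per membership that belongs to one of those members.
query =
  from m in "memberships",
    where: m.member_id in subquery(multi_member_ids),
    select: {m.start_date, m.end_date}

# Repo is assumed to be your application's repo module, as in the question:
# Repo.all(query)
```

    This keeps the select as an ordinary Ecto tuple, at the cost of referencing the memberships table twice.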