Search code examples
ruby-on-railspostgresqlactiverecordarel

PG::InvalidColumnReference: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list


I have a long chain of associations, joins, order by, etc., that is ultimately selecting from one of my rails models. At the end of the day I need the results to be unique and sorted. I don't care what columns are used in the SELECT statement, what appears in the ORDER BY, etc. (these all change based on the filtering options the user has selected), I just care that the top level model/table in the query is unique (based on id).

For background, widgets is the main table, and we are joining with widget_steps, and this is in Rails 3 (company is trying to upgrade ASAP but that's what they're stuck with at the moment)

Here is the query and error that is being generated:

PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...completed_at" IS NULL)) ORDER BY sequential DESC, widget_s...
                                                             ^
: SELECT  DISTINCT "widgets".* FROM "widgets" INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id" INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id" WHERE "widgets"."account_id" = 1 AND "widgets"."completed_at" IS NULL AND (("widgets"."sequential" = 't' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."id" = "widgets"."active_widget_step_id" AND "widget_steps"."completed_at" IS NULL) OR ("widgets"."sequential" = 'f' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."completed_at" IS NULL)) ORDER BY sequential DESC, widget_steps.name ASC LIMIT 10 OFFSET 0
Completed 500 Internal Server Error in 52.3ms

ActiveRecord::StatementInvalid - PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...completed_at" IS NULL)) ORDER BY sequential DESC, widget_s...
                                                             ^
: SELECT  DISTINCT "widgets".* FROM "widgets" INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id" INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id" WHERE "widgets"."account_id" = 1 AND "widgets"."completed_at" IS NULL AND (("widgets"."sequential" = 't' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."id" = "widgets"."active_widget_step_id" AND "widget_steps"."completed_at" IS NULL) OR ("widgets"."sequential" = 'f' AND "widget_steps"."assigned_to" = 5 AND "widget_steps"."completed_at" IS NULL)) ORDER BY sequential DESC, widget_steps.name ASC LIMIT 10 OFFSET 0:

Why is this a thing? What does postgres think is so ambiguous about this? Why do queries like this always work fine in MySQL but make postgres choke.

I have tried:

  1. specifying .select([everything mentioned in the order by]).uniq at the end of the chain
  2. specifying .uniq at the end of the chain without doing a custom select
  3. writing some custom AREL to try to embed all of this in a sub query and then do the .uniq or the .order outside of this (can't get this working)
  4. doing the .uniq outside of postgres (this breaks because of pagination ... you can end up with some pages that only have 1 or 2 items on them because of duplicates being removed)
  5. crying

Solution

  • You need to add widget_steps.name to list of selected columns:

    SELECT DISTINCT "widgets".*,
                    "widget_steps.name"
    FROM "widgets"
    INNER JOIN "widget_steps" ON "widget_steps"."widget_id" = "widgets"."id"
    INNER JOIN "widget_steps" "active_steps_widgets" ON "active_steps_widgets"."id" = "widgets"."active_widget_step_id"
    WHERE "widgets"."account_id" = 1
      AND "widgets"."completed_at" IS NULL
      AND (("widgets"."sequential" = 't'
            AND "widget_steps"."assigned_to" = 5
            AND "widget_steps"."id" = "widgets"."active_widget_step_id"
            AND "widget_steps"."completed_at" IS NULL)
           OR ("widgets"."sequential" = 'f'
               AND "widget_steps"."assigned_to" = 5
               AND "widget_steps"."completed_at" IS NULL))
    ORDER BY sequential DESC,
             widget_steps.name ASC
    LIMIT 10
    OFFSET 0
    

    This should not change logic of your query and will work fine.

    In Rails you may use select method to set list of selected columns:

    Widget.select('"widgets".*, "widget_steps.name"')
    

    Hope this would help.