(please be patient with a self-taught but fascinated beginner in postgres :)
In POSTGRES 10 I have come up with a table to track the progress in editing and archiving photo folders. This progress is measured in sequential steps, which link via foreign_key to another table that basically states 0 = "open", 1 = "work", 2 = "done". Hence I use numeric values to reference these states.
To keep the question simple here are the most important columns with some demo data:
folder_name | archive_status | step_01 | step_02 | step_03 | ...
----------------------------------------------------------------------------
john | * | 0 | 2 | 0 | ...
paul | * | 0 | 0 | 2 | ...
george | * | 2 | 1 | 0 | ...
ringo | * | 0 | 2 | 0 | ...
The desired result would be something like:
folder_name | archive_status | step_01 | step_02 | step_03 | ...
----------------------------------------------------------------------------
john | step_02 | 0 | 2 | 0 | ...
paul | step_03 | 0 | 0 | 2 | ...
george | step_01 | 2 | 1 | 0 | ...
ringo | step_02 | 0 | 2 | 0 | ...
My two problems are:
I tried approaching it with the Aggregate Functions https://www.postgresql.org/docs/10/tutorial-agg.html but I am kind of stuck, since I don't need multiple input rows, but filtering multiple columns within a row.
--- UPDATE: New problem ---
Sorry for that, but my initial demo data was not clear enough, these columns act more as a progress tracker, where in fact EACH cell within these columns could show any value. (This is where the original solution in the answer fails, since if there are two same "highest" values in a row, it will use the first occurence.) So the SQL query should find the column at the farthest to the right so to speak. Of cource I could come up with some clever "score calculations" over here, but in fact things would be so much easier with this current "matrix style" design.
So again, the desired result (archive_status) according to the values (step_xx) would be:
folder_name | archive_status | step_01 | step_02 | step_03
--------------------------------------------------------------------
john | step_02 | 2 | 2 | 0
paul | step_03 | 1 | 1 | 2
george | step_01 | 2 | 1 | 1
ringo | step_02 | 2 | 2 | 1
One option uses greatest()
and a case
expression:
select t.*,
case greatest(step_01, step_02, step_03)
when step_01 then 'step_01'
when step_02 then 'step_02'
when step_03 then 'step_03'
end as archive_status
from mytable t
While this would work for your current problem, I would recommend normalizing your design. Each step should be stored in a separate row rather than as a column, in a structure like (folder_name, step, status)
. Then you would use distinct on
:
select distinct on (folder_name) t.*
from newtable t
order by folder_name, status desc, step