Search code examples
sqlhaskellyesodesqueleto

Unique post author from Esqueleto


I have an Esqueleto query that selects all of the StatusUpdates and their respective Users. I'd like to limit it to only one StatusUpdate per User, and only StatusUpdates from the current day.

I have a working SQL query, I'm just struggling with turning it into valid Esqueleto.

SELECT "email", "subject"
FROM "status_update"
LEFT JOIN "user"
ON ("status_update"."user" = "user"."id")
WHERE "status_update"."id"
IN (SELECT MAX("status_update"."id") FROM "status_update" GROUP BY "status_update"."user")

What I have so far:

  statusUpdates <- runDB
    $ E.select
    $ E.from $ \(status_update `E.LeftOuterJoin` user) -> do

      E.on (status_update ^. StatusUpdateUser E.==. user ^. UserId)

      E.where_ ((status_update ^. StatusUpdateId) `E.in_`
        (E.subList_select $ E.from $ \(status_update) -> do
          E.groupBy (status_update ^. StatusUpdateUser)
          return (status_update ^. StatusUpdateId)))

      return
        ( status_update ^. StatusUpdateId
        , status_update ^. StatusUpdateSubject
        , status_update ^. StatusUpdateMessage
        , user ^. UserEmail
        )

…Which produces the following output:

SELECT "status_update"."id", "status_update"."subject", "status_update"."message", "user"."email"
FROM "status_update"
LEFT OUTER JOIN "user"
ON "status_update"."user" = "user"."id"
WHERE "status_update"."id"
IN (SELECT "status_update2"."id" FROM "status_update" AS "status_update2" GROUP BY "status_update2"."user"); []

It seems the only thing that's missing is the MAX function in my second select. I've been trying to squeeze E.max_ into different parts of the query but I can't seem to make anything work.

Help?


Solution

  • Instead of using subList_select and trying to call max_, you could just use sub_select and have the subquery sort by date and limit 1.

    This is the solution that I found, which seems to do the trick:

        result <- select $ from $ \(user, status_update) -> do
            let subquery = from $ \status_update2 -> do
                where_ (status_update2 ^. StatusUpdateUser ==. user ^. UserId)
                where_ (date (status_update2 ^. StatusUpdatePosted) ==. date now)
                orderBy [desc (status_update2 ^. StatusUpdatePosted)]
                limit 1
                return (status_update2 ^. StatusUpdateId)
    
            where_ (status_update ^. StatusUpdateId ==. sub_select subquery)
            where_ (user ^. UserId ==. status_update ^. StatusUpdateUser)
            return
                ( status_update ^. StatusUpdateId
                , status_update ^. StatusUpdateSubject
                , status_update ^. StatusUpdateMessage
                , user ^. UserEmail
                )
    

    The resulting SQL is:

    SELECT "status_update"."id", "status_update"."subject", "status_update"."message", "user"."email"
    FROM "user", "status_update"
    WHERE ("status_update"."id" = (SELECT "status_update2"."id"
        FROM "status_update" AS "status_update2"
        WHERE ("status_update2"."user" = "user"."id")
        AND (date("status_update2"."posted") = date(date(?)))
        ORDER BY "status_update2"."posted" DESC
        LIMIT 1))
    AND ("user"."id" = "status_update"."user")
    

    To satisfy the "only StatusUpdates from the current day" condition I have defined date and now by importing Database.Esqueleto.Internal.Sql and:

    date :: SqlExpr (Value UTCTime) -> SqlExpr (Value Int)
    date d = unsafeSqlFunction "date" d
    
    now :: SqlExpr (Value UTCTime)
    now = unsafeSqlFunction "date" (val "now" :: SqlExpr (Value String))
    

    However, what exactly "from the current day" means to you could be something different (past 24 hours, in a certain timezone, etc.).