Search code examples
sqldatabasepostgresqljoinleft-join

Conditionally use set-retuning function as join clause


Context

I'm implementing a "job posting" feature that will list jobs available for users. A job can be set for either an interval start_date and end_date or sparse dates in an array service_dates. The availability check should be made using the previously described date information present at the jobs model against an availabilities model.

Their structures are the following

jobs

        Column          |             Type                               
-------------------------+------------------------------
 id                      | bigint                        
 service_dates           | timestamp without time zone[]
 start_date              | timestamp without time zone   
 end_date                | timestamp without time zone  

availabilities

        Column          |             Type                               
-------------------------+------------------------------
 id                      | bigint   
 user_id                 | integer                                          
 date                    | date
 spaces_left             | integer   

When joining then, I wanna cross the job's dates with the matching dates that exist in the availabilities table. The problem is that the job's dates are available in two different ways.

For sparse dates I can have it working if I do the following left join:

SELECT j.*
FROM jobs j
LEFT JOIN 
    availabilities a 
        ON a.user_id = 1234
        AND a.date in (
          --this will compose a set of the sparse dates
          SELECT UNNEST(j.service_dates)
        )

For the start_date and end_date option this will do the trick:

SELECT j.*
FROM jobs j
LEFT JOIN 
    availabilities a 
        ON a.user_id = 1234
        AND a.date in (
          --this will compose a set with the date range between the two dates
          SELECT GENERATE_SERIES(j.start_date::date, j.end_date::date, '1 day'::interval)
        )

My problem is that I need to apply either one or another based on the condition of jobs.service_dates being present for the particular job record. Originally I thought I could do the following:

SELECT j.*
FROM jobs j
LEFT JOIN 
    availabilities a 
        ON a.user_id = 1234
        AND a.date in (
          -- use one or the other based on whether or not there's something on j.service_dates
          CASE cardinality(j.service_dates) > 0
          WHEN TRUE THEN
             (SELECT UNNEST(j.services))
          ELSE
            (SELECT GENERATE_SERIES(j.start_date::date, j.end_date::date, '1) day'::interval)
          END
        )

But I get the following error: ERROR: more than one row returned by a subquery used as an expression

if I try to select from whatever would be returned by the CASE statement there I get the follwing: ERROR: set-returning functions are not allowed in CASE

I think I have what I need in terms of how to logically relate the tables within the scenarios that can be presented already figured out. What I need is a way to conditionally apply either logic.


Update 1

As @Kadet pointed out, user_id was missing from the availabilitites table so I've added that. His answer also lead me to accomplish what I wanted, I ended up not going with the exact same solution but I've applied the CASE when joining the way he suggested rather than the way I was trying to. There is nothing wrong with his answer, I just didn't use it because there were other particularities with this query that I left out of the question (for simplicity) that would work better with the way I chose. Here's how I did it:

LEFT JOIN
    availabilities a
      ON a.user_id = 1234
      AND case when cardinality(j.service_dates) > 0
        THEN a.date IN ( SELECT unnest(j.service_dates) )
        ELSE a.date IN ( SELECT GENERATE_SERIES(j.start_date::date, j.end_date::date, '1 day'::interval) )
      END

Update 2

An honorable mention to @Ely conceptually his answer is perfectly acceptable and I believe it would be correct. This would not work for me due to details I failed to expose on the first draft of my question, which is, service_dates and start_date and end_date aren't mutually exclusive. The former 2 are always present, being filled with service_date's min and max values. But his idea will probably work for anyone having a similar issue without that particular constraint. So up-vote him if that´s you :)


Solution

  • This table structure is invalid or incomplete (no user_id field in availabilities table) but just to show an idea

    
    SELECT j.*
    FROM jobs j
    LEFT JOIN 
        availabilities a 
            ON a.user_id = 1234
    WHERE 
       case when cardinality(j.service_dates) > 0
         then a.date = any(j.service_dates)
         else a.date between start_date and end_date
       end
    

    or, if generate series is required

    SELECT j.*
    FROM jobs j
    LEFT JOIN 
        availabilities a 
            ON a.user_id = 1234
    WHERE 
       case when cardinality(j.service_dates) > 0
         then a.date = any(j.service_dates)
         else a.date = any ( select GENERATE_SERIES(j.start_date::date, j.end_date::date, '1 day'::interval))
       end