Search code examples
datetimeamazon-redshiftdate-manipulation

Remaining Business Days This Month in Amazon Redshift


I would like to find a way to determine the remaining number of business days in this calendar month using Redshift. I currently have a MySQL version written by a friend. I don't know enough about how it was written to even translating into the other dialect. But if anyone could help translating it, this would a very useful tool!

The output should function just like the networkdays() function from excel. In this function, a begin_date and an end_date are provided as arguments for the function. It calculates the number of business days (non-weekend calendar days) between the begin and end dates inclusively.

Here is the current MySQL:

SELECT 1 AS pk ,COUNT(*) AS remaining
FROM (
    SELECT WEEKDAY(DATE(DATE_FORMAT(CONVERT_TZ(NOW(), 'UTC', 'PST8PDT'), '%Y-%m-01') + INTERVAL (a.num-1) DAY)) AS weekdays
        FROM (
        SELECT @row := @row + 1 AS num
        FROM schema.table t, (SELECT @row := 0) r
    ) a
        WHERE a.num >= DAY(CONVERT_TZ(NOW(), 'UTC', 'PST8PDT'))
        AND a.num <= DAY((DATE_FORMAT(CONVERT_TZ(NOW(), 'UTC', 'PST8PDT'), '%Y-%m-01') + INTERVAL 1 MONTH) - INTERVAL 1 DAY)
) b
WHERE b.weekdays NOT IN (0,6)

Any help would be great!


Solution

  • A popular idea for data warehouses is to create a calendar table that contains all dates (or at least dates relevant to the company), together with flags such as:

    • Public Holidays
    • Work days
    • First (work) day of month
    • Last (work) day of month
    • Month number
    • Week number
    • Day number

    While many of these values can be calculated via date functions, it can often be easier to join to the calendar table to perform some date functions.

    In the case of calculating remaining business days, it would just be a matter of counting the number of rows in the calendar table that are within the desired range, where the is_work_day flag is set. This could be done via a JOIN or a sub-query.

    Not as fancy as some of those queries but often much easier to maintain. Also, Amazon Redshift does not support the generate_series function, so it's often all that's possible.

    See also: