Search code examples
sqlsnowflake-cloud-data-platformgroup

SQL Snowflake Min Date For Each Record & Group By Month


I am using Snowflake to get some information from the data. I have a table in which the same employee (PRS_ID) can be included in many rows and I want to take the EARLIEST date (BGN_DATE) for each employee (PRS_ID). Next, I want to see how many employees I have each month.

Ideally if I could have in YYYY-MM format (YYYYMM will work too). What I was working with is below:

SELECT PRS_ID, MIN(BGN_DATE), MONTH(BGN_DATE)
FROM myTable
WHERE EMP_STS = 'T' 
GROUP BY PRS_ID, MONTH(BGN_DATE)

However, this will show me the data per employee (PRS_ID), which is too granular (as explained above). But when I remove "PRS_ID" from grouping I get below error:

SQL compilation error: error line 1 at position 7 'myTable.PRS_ID' in select clause is neither an aggregate nor in the group by clause.

Does anyone know how to fix it?

Thank you

Sample Data:

PRS_ID EMP_STS BGN_DATE
homsimps T 2022-01-30
homsimps T 2022-02-28
homsimps T 2022-03-30
bartsimps T 2022-01-30
bartsimps T 2022-02-28
bartsimps T 2022-03-31
lisasimps T 2022-04-30
lisasimps T 2022-05-31
lisasimps T 2022-06-30
lisasimps T 2022-07-30
margesimps T 2022-02-28
margesimps T 2022-03-30

Expected Outcome:

Period Count
2022-01 2
2022-02 1
2022-03 0
2022-04 1

Solution

  • Using aggregation twice:

    WITH cte AS (
       SELECT PRR_ID, MIN(BGN_DATE) AS min_bgn_date
       FROM my_table
       WHERE EMP_STS = 'T'
       GROUP BY PRS_ID
    )
    SELECT TO_VARCHAR(min_bgn_date, 'YYYYMM') AS month, COUNT(*) AS cnt
    FROM cte
    GROUP BY TO_VARCHAR(min_bgn_date, 'YYYYMM');
    -- GROUP BY month