Search code examples
sqlpostgresqlwindow-functions

PostgreSQL - Get value from from the same table


Get the average traffic from last week data per WEEK number and get the traffic data for last week Traffic(D-7)

For example if date = 5/13/2023, need to output traffic data (Traffic(D-7)) for date = 5/6/2023

enter image description here

I manage to get the Average but no idea how to retrieve the date-7 data and output it altogether

create table a
(
date  varchar(50),
Tname varchar(50),
Week varchar(5),
Traffic float
)

insert into  a values ('5/1/2023', 'ID1', '18', 7.98)
insert into  a values ('5/2/2023', 'ID1', '18', 4.44)
insert into  a values ('5/3/2023', 'ID1', '18', 5.66)
insert into  a values ('5/4/2023', 'ID1', '18', 10.01)
insert into  a values ('5/5/2023', 'ID1', '18', 9.41)
insert into  a values ('5/6/2023', 'ID1', '18', 6.71)
insert into  a values ('5/7/2023', 'ID1', '18', 8.24)
insert into  a values ('5/8/2023', 'ID1', '19', 8.97)
insert into  a values ('5/9/2023', 'ID1', '19', 6.74)
insert into  a values ('5/10/2023', 'ID1', '19', 6.45)
insert into  a values ('5/11/2023', 'ID1', '19', 9.33)
insert into  a values ('5/12/2023', 'ID1', '19', 8.08)
insert into  a values ('5/13/2023', 'ID1', '19', 8.36)


SELECT date, Tname, Week,
AVG(Traffic) OVER(PARTITION BY Week) AS AVTraffic
FROM a
ORDER BY week

http://sqlfiddle.com/#!18/538b7/3


Solution

  • First of all, you need to fix your flaws in your table schema design, and declare:

    • dates with the "DATE" type (instead of VARCHAR(50))
    • week values with the INT type (instead of VARCHAR(5))
    • traffic values with the DECIMAL type (instead of FLOAT)
    CREATE TABLE tab(
        DATE      DATE,
        Tname     VARCHAR(50),
        Week      INT,
        Traffic   DECIMAL(4,2)
    );
    

    Once you've carried it out, you can solve this problem by:

    • creating a ranking value for each day of the week in your weeks, using EXTRACT on your date
    • extracting your traffic value from previous week with LAG, by partitioning on your ranking created at previous step, and ordering on the week_number.
    WITH cte AS (
        SELECT date, Tname, Week, Traffic,
               ROUND(AVG(Traffic) OVER(PARTITION BY Week), 2) AS AVGTraffic,
               EXTRACT(ISODOW FROM date) - 1                  AS week_day
        FROM tab
    )
    SELECT date, Tname, Week, 
           LAG(Traffic) OVER(PARTITION BY week_day ORDER BY Week) AS prevweek_traffic,
           AVGTraffic
    FROM cte
    ORDER BY Week, week_day
    

    And if you realize that you may have holes among your weeks (..., week 17, week 18, week 20, week 21, ...) and specifically want values from the exact previous week (that may be missing), you can add a filter on the LAG function, that checks if week and previous week are consecutive:

    ...
    CASE WHEN LAG(Week) OVER(PARTITION BY week_day ORDER BY Week) = Week-1
         THEN LAG(Traffic) OVER(PARTITION BY week_day ORDER BY Week) 
    END
    ...
    

    (in place of LAG(Traffic) OVER(...) only)

    Output:

    date tname week prevweek_traffic avgtraffic
    2023-05-01T00:00:00.000Z ID1 18 null
    2023-05-02T00:00:00.000Z ID1 18 null
    2023-05-03T00:00:00.000Z ID1 18 null
    2023-05-04T00:00:00.000Z ID1 18 null
    2023-05-05T00:00:00.000Z ID1 18 null
    2023-05-06T00:00:00.000Z ID1 18 null
    2023-05-07T00:00:00.000Z ID1 18 null
    2023-05-08T00:00:00.000Z ID1 19 7.98
    2023-05-09T00:00:00.000Z ID1 19 4.44
    2023-05-10T00:00:00.000Z ID1 19 5.66
    2023-05-11T00:00:00.000Z ID1 19 10.01
    2023-05-12T00:00:00.000Z ID1 19 9.41
    2023-05-13T00:00:00.000Z ID1 19 6.71

    Check the demo here.

    This query allows any kind of holes in your data, if that's a needed requirement.

    Note: The last ORDER BY clause is not needed. It's there just for visualization purposes.