Search code examples
sqlselectcaseteradataerror-code

Teradata: Error 3504 and generating subset of column in Select


Hi I was trying to work on a teradata sql problem where I need to exclude all saledate in Aug 2005 and calculate the daily revenue for each store/month/year combination for any stores that have no fewer than 20 sale days.

My idea is to generate a subset of the saledate column in the subquery and work with it. And here is my code.

SELECT Sub.store, Sub.Year_, Sub.Month_, Sub.TotalSaleDate, Sub.Daily_rev, sub.Total_rev
FROM (SELECT (CASE WHEN (NOT (EXTRACT(MONTH from saledate)=8 
             AND EXTRACT(YEAR from saledate)=2005)) THEN saledate END) AS 
             NewSaleDate, COUNT(NewSaleDate) AS TotalSaleDate,
             SUM(amt) AS Total_rev,
             Total_rev/TotalSaleDate AS Daily_rev,
             EXTRACT(MONTH from NewSaleDate) AS Month_,
             EXTRACT(YEAR from NewSaleDate) AS Year_, store
       FROM trnsact
       WHERE stype = 'P' AND saledate = NewSaleDate
       GROUP BY store, Year_, Month_, NewSaleDate) AS Sub
WHERE Sub.TotalSaleDate >= 20
ORDER BY sub.TotalSaledate ASC; 

And this is my output My result

Here is code from someone that worked

SELECT 
  sub.store, sub.year_num, sub.month_num, sub.num_dates, sub.daily_revenue
FROM (
  SELECT 
  store, 
  EXTRACT (month FROM saledate) AS month_num, 
  EXTRACT (year FROM saledate) AS year_num,
  COUNT (DISTINCT saledate) AS num_dates,
  SUM(amt) AS total_revenue,
  total_revenue/num_dates AS daily_revenue,
  (CASE 
  WHEN (year_num=2005 AND month_num=8) THEN 'cannot' ELSE 'can' 
  END) As can_use_anot
  FROM trnsact
  WHERE stype='p' AND can_use_anot='can'
  GROUP BY store, month_num, year_num
  ) AS sub
HAVING sub.num_dates >=20
GROUP BY sub.store, sub.year_num, sub.month_num, sub.num_dates, sub.daily_revenue
ORDER BY sub.num_dates ASC;

And his result Correct result

Apparently his Daily revenue is much higher than mine. I wonder whether it is due to I am not counting Distinct saledate in the subquery. However, I tried to add use COUNT(DISTINCT saledate) and I get no output at all, 0 row. I understand how his code works but I`m frustrated where is wrong in my code. ESPECIALLY WHY ADDING DISTINCT GAVE ME 0 ROW, DEEPLY APPRECIATE ANYONE WHO CAN EXPLAIN...


Solution

  • Select #1 agggregates by date while #2 by month, just compare the number of rows returned.

    What I don't understand why both queries use that strange way to exclude dates from Aug 2005. Should be done using a simple WHERE saledate not between date '2005-08-01' and date '2005-08-31'

    And there's no need for using a Derived Table:

    SELECT 
       store, 
       EXTRACT (month FROM saledate) AS month_num, 
       EXTRACT (year FROM saledate) AS year_num,
       COUNT (DISTINCT saledate) AS num_dates,
       SUM(amt) AS total_revenue,
       total_revenue/num_dates AS daily_revenue
    FROM trnsact
    WHERE saledate not between date '2005-08-01' and date '2005-08-31'
      and stype='p'
    GROUP BY store, month_num, year_num
    HAVING num_dates >=20
    ORDER BY sub.num_dates ASC;
    

    But there's probably a large number of rows per day/month and it might be more efficient to aggregate by day first avoiding EXTRACT on all rows and the COUNT(DISTINCT):

    SELECT -- now aggregate by month
       store, 
       EXTRACT (month FROM saledate) AS month_num, 
       EXTRACT (year FROM saledate) AS year_num,
       COUNT (*) AS num_dates, -- no need for DISTINCT anymore
       SUM(daily_amt) AS total_revenue,
       total_revenue/num_dates AS daily_revenue
    FROM
     (  -- daily sales first
        SELECT 
           store, 
           saledate, 
           SUM(amt) AS daily_amt
        FROM trnsact
        WHERE saledate not between date '2005-08-01' and date '2005-08-31'
          and stype='p'
        GROUP BY store, saledate
     ) as dt
    GROUP BY store, year_num, month_num
    ;