sql, sql-server, sql-server-2005

Multiple transactions within a certain time period, limited by date range


I have a database of transactions, people, transaction dates, items, etc. Each time a person buys an item, the transaction is stored in the table like so:

personNumber, TransactionNumber, TransactionDate, ItemNumber

What I want to do is find people (personNumber) who, between January 1st 2012 and March 1st 2012 (transactionDate), have purchased the same ItemNumber more than once within 14 days or less (the 14 days should be configurable). I then need to list all of those transactions on a report.

Sample data:

personNumber| TransactionNumber| TransactionDate| ItemNumber
1           |               100|      2001-01-31|        200
2           |               101|      2001-02-01|        206
2           |               102|      2001-02-11|        300
1           |               103|      2001-02-09|        200
3           |               104|      2001-01-01|        001
1           |               105|      2001-02-10|        200
3           |               106|      2001-01-03|        001
1           |               107|      2001-02-28|        200

Results:

personNumber| TransactionNumber| TransactionDate| ItemNumber
1           |               100|      2001-01-31|        200
1           |               103|      2001-02-09|        200
1           |               105|      2001-02-10|        200
3           |               104|      2001-01-01|        001
3           |               106|      2001-01-03|        001

How would you go about doing that?

I've tried doing it like so:

select *
from (
    select personNumber, transactionNumber, transactionDate, itemNumber,
           count(*) over (
               partition by personNumber, itemNumber) as boughtSame
    from transactions
    where transactionDate between '2001-01-01' and '2001-03-01') t
where boughtSame > 1

and it gets me this:

personNumber| TransactionNumber| TransactionDate| ItemNumber
1           |               100|      2001-01-31|        200
1           |               103|      2001-02-09|        200
1           |               105|      2001-02-10|        200
1           |               107|      2001-02-28|        200
3           |               104|      2001-01-01|        001
3           |               106|      2001-01-03|        001

The issue is that I don't want TransactionNumber 107, since that's not within the 14 days. I'm not sure where to put in that limit of 14 days. I could do a datediff, but where, and over what?


Solution

  • Alas, the window functions in SQL Server 2005 just are not quite powerful enough. I would solve this using a correlated subquery.

    The correlated subquery counts, for each purchase, how many other purchases of the same item the same person made within the 14 days that follow it (the purchase itself is not counted).

    select t.*
    from (select t.*,
                 -- other purchases of the same item by the same person
                 -- in the 14 days starting at this purchase
                 (select count(*)
                  from transactions t2
                  where t2.personnumber = t.personnumber and
                        t2.itemnumber = t.itemnumber and
                        t2.transactionnumber <> t.transactionnumber and
                        t2.transactiondate >= t.transactiondate and
                        t2.transactiondate < DATEADD(day, 14, t.transactiondate)
                 ) NumWithin14Days
          from transactions t
          where transactionDate between '2001-01-01' and '2001-03-01'
         ) t
    where NumWithin14Days > 0
    

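    You may want to apply the date-range limit (the between condition on transactionDate) inside the subquery as well, so that purchases outside the reporting period are not counted.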
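
    As written, the subquery only looks forward from each purchase, so the last purchase in a cluster (transactions 105 and 106 in the sample data) has nothing after it to count and would be left out, although the expected results list both. A minimal sketch of one way to address that, looking 14 days in both directions and repeating the date-range filter inside the subquery, might look like the following. The variables @days, @fromDate and @toDate and the names x and NumWithinWindow are illustrative, not part of the original answer, and the inclusive BETWEEN bounds are an assumption based on the question's "14 days ... or less".

    ```sql
    -- Hypothetical parameters; SQL Server 2005 needs DECLARE and SET as separate statements.
    DECLARE @days int, @fromDate datetime, @toDate datetime;
    SET @days = 14;              -- configurable window, in days
    SET @fromDate = '20010101';  -- start of the reporting period
    SET @toDate = '20010301';    -- end of the reporting period

    SELECT x.personNumber, x.TransactionNumber, x.TransactionDate, x.ItemNumber
    FROM (SELECT t.*,
                 (SELECT COUNT(*)
                  FROM transactions t2
                  WHERE t2.personNumber = t.personNumber
                    AND t2.ItemNumber = t.ItemNumber
                    AND t2.TransactionNumber <> t.TransactionNumber
                    -- look both backwards and forwards from this purchase (inclusive bounds)
                    AND t2.TransactionDate BETWEEN DATEADD(day, -@days, t.TransactionDate)
                                               AND DATEADD(day, @days, t.TransactionDate)
                    -- only count neighbours that fall inside the reporting period
                    AND t2.TransactionDate BETWEEN @fromDate AND @toDate
                 ) AS NumWithinWindow
          FROM transactions t
          WHERE t.TransactionDate BETWEEN @fromDate AND @toDate
         ) x
    WHERE x.NumWithinWindow > 0
    ORDER BY x.personNumber, x.TransactionNumber;
    ```

    On the sample data this should return transactions 100, 103, 105, 104 and 106 and leave out 107, whose nearest matching purchase is 18 days earlier.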

    An index on transactions(personnumber, itemnumber, transactionnumber, transactiondate) might help this run much faster.
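
    As a rough sketch, such an index could be created as shown below. The index name is made up for illustration, and the column names are taken from the table layout in the question.

    ```sql
    -- Index name is hypothetical; the column order follows the suggestion above.
    CREATE NONCLUSTERED INDEX IX_transactions_person_item
        ON transactions (personNumber, itemNumber, transactionNumber, transactionDate);
    ```

    Listing transactionDate ahead of transactionNumber may let the optimizer seek on the date range as well. That ordering is a variation worth testing against your own data, not something the answer specifies.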