Search code examples
mysqlgreatest-n-per-group

MYSQL GROUP BY with WHERE CLAUSE


I've looked through the greatest-n-group-by group and can't find this answer and can't make the JOIN solutions work with the WHERE CLAUSE. Here's what I've got.

t1 table:

ID  Product_Name    Quantity    pharmacyShort   Copay_Amount
581284  AMLODIPINE 10 MG    30  WALGREENS   21.07
581283  AMLODIPINE 10 MG    30  CVS 36.61
581282  AMLODIPINE 10 MG    28  RITE AID    30.98
581280  AMLODIPINE 10 MG    60  WALGREENS   50.65
581279  AMLODIPINE 10 MG    30  CVS 29.78
581278  AMLODIPINE 10 MG    30  RITE AID    14.28
581277  AMLODIPINE 10 MG    180 WALGREENS   33.83
581276  AMLODIPINE 10 MG    15  CVS 18.33
581275  AMLODIPINE 10 MG    10  RITE AID    45.93
581274  AMLODIPINE 10 MG    30  PUBLIX  33.75

I also have a priority table, call it t2, on which I'd like the output prioritized (sorted) by:

id  pharmacyShort   COUNT
1   CVS 100
2   RITE AID    99
3   TARGET  98
4   WALGREENS   97
5   WALMART 96
6   KMART   95
7   KROGER  94
8   PUBLIX  93

MYSQL:

SELECT t1.pharmacyShort , t1.Copay_Amount 
FROM `t1` 
INNER JOIN `t2` 
ON t1.pharmacyShort = t2.pharmacyShort 
WHERE t1.Product_Name = 'AMLODIPINE 10 MG' 
AND t1.Quantity = '30' 
AND t1.ID > 555000 
GROUP BY t1.pharmacyShort 
ORDER BY t2.COUNT DESC LIMIT 30

I have the ID > 555000 to keep the query time low. Also, I'm trying to show the most recent prices. Perhaps that can be done with an ORDER BY but the challenge is that some drugs have more claims while others have much less which means I have to go further back in time to get a price.

What I'm hoping to get is this (I actually only need the pharmacyShort and Copay_Amount columns on the output, but left the other columns for clarity here):

581283  AMLODIPINE 10 MG    30  CVS 36.61
581284  AMLODIPINE 10 MG    30  WALGREENS   21.07
581278  AMLODIPINE 10 MG    30  RITE AID    14.28
581274  AMLODIPINE 10 MG    30  PUBLIX  33.75

The output should pick only the CVS with ID 581283 and show just the one claim from CVS but instead (as I've learned) GROUP BY is picking the older claim, 581279, like this:

581279  AMLODIPINE 10 MG    30  CVS 29.78
581284  AMLODIPINE 10 MG    30  WALGREENS   21.07
581278  AMLODIPINE 10 MG    30  RITE AID    14.28
581274  AMLODIPINE 10 MG    30  PUBLIX  33.75

If it makes it easier, I can live without the priority table. I've also considered trying to take care of this on the PHP side by some kind of array sort, but I think MYSQL would be faster.

Many, many thanks and good karma to any and all that can help.

UPDATE:

Based on the answers I've gotten, here is where I am:

SELECT a.pharmacyShort, a.copay_amount
FROM `t1` a
JOIN (SELECT MAX(ID) as maxid, pharmacyShort
FROM `t1`
WHERE Product_Name = 'AMLODIPINE 10 MG' 
      AND Quantity = '30' 
      GROUP BY pharmacyShort) AS maxt1
  ON a.pharmacyShort = maxt1.pharmacyShort AND a.id = maxt1.maxid
join t2 b ON a.pharmacyShort = b.pharmacyShort
ORDER BY b.count DESC
LIMIT 20

I removed from the WHERE clause the id > 550000 because using the MAX(ID) I don't think I'll need it.

But I'm getting a zero result set. So where am I going wrong?


Solution

  • Use the Joining with simple group-identifier, max-value-in-group Sub-query method at SQL select only rows with max value on a column and put the conditions in the sub-query.

    Then join this with t2 to get the count column for ordering.

    SELECT t1.pharmacyShort, t1.copay_amount
    FROM t1
    JOIN (SELECT pharmacyShort, MAX(id) AS maxid
          FROM t1
          WHERE Product_Name = 'AMLODIPINE 10 MG' 
          AND Quantity = '30' 
          AND ID > 555000
          GROUP BY pharmacyShort) AS maxt1
        ON t1.pharmacyShort = maxt1.pharmacyShort AND t1.id = maxt1.maxid
    JOIN t2 on t1.pharmacyShort = t2.pharmacyShort
    ORDER BY t2.count DESC
    LIMIT 20