Search code examples
postgresqldaterepeatpercentage

How do I get PostgreSQL to recognize repeating dates for a mathematical operation?


Very new to SQL querying. Using PostgreSQL.

I am trying to build a query that tells me what percentage of the time a unique customer id makes multiple transactions on the same day.

I have a query built that gets me the customer ids and transaction dates (if there are multiple on the same day, the date repeats.

Below is my query..

SELECT customer.customer_id, rental_date::date FROM customer
FULL OUTER JOIN rental
ON customer.customer_id = rental.customer_id
FULL OUTER JOIN inventory
ON rental.inventory_id = inventory.inventory_id
FULL OUTER JOIN film
ON inventory.film_id = film.film_id
ORDER BY customer.customer_id, rental_date

Update:

Query now reads:

SELECT customer.customer_id, rental_date::date, COUNT (*)
  FROM customer
  JOIN rental ON customer.customer_id = rental.customer_id
  JOIN inventory ON rental.inventory_id = inventory.inventory_id
  JOIN film ON inventory.film_id = film.film_id
GROUP BY customer.customer_id, rental_date
ORDER BY customer.customer_id, rental_date

Output:

+-------------+-------------+-------+
| customer_id | rental_date | count |
+-------------+-------------+-------+
| 1 | 2005-05-25 | 1 |
| 1 | 2005-05-28 | 1 |
| 1 | 2005-06-15 | 1 |
| 1 | 2005-06-15 | 1 |
| 1 | 2005-06-15 | 1 |
| 2 | 2005-06-16 | 1 |
+-------------+-------------+-------+

Desired output:

+-------------+-------------+-------+
| customer_id | rental_date | count |
+-------------+-------------+-------+
| 1 | 2005-05-25 | 1 |
| 1 | 2005-05-28 | 1 |
| 1 | 2005-06-15 | 3 |
| 2 | 2005-06-16 | 1 |
+-------------+-------------+-------+


Solution

  • What you are looking for is count and having. Count will get you the number of purchases by day and Having can be used to eliminate those with 0 or 1 purchases on a given day.

    select customer.customer_id, rental_date, count(*) 
      from customer
      join rental    on customer.customer_id = rental.customer_id
      join inventory on rental.inventory_id  = inventory.inventory_id
      join film      on inventory.film_id    = film.film_id
     group by customer.customer_id, rental_date 
    having count(*) > 1
     order by customer.customer_id, rental_date ;
    

    Also I doubt you want full outer join. That returns all rows from both the joined tables even when none exist in the other. I change it to an inner join (you only want customers that have rentals and also inventory that also have rentals. Even though now the having would eliminate the extras. Try removing the having clause then run with both full and again inner joins and see the difference.