this is the continue question from find out time difference for every user in condition mysql 5.7
this is my fiddle https://dbfiddle.uk/?rdbms=mysql_5.7&fiddle=31b3be9d1e2444eb0b32c262176aa4b4
i have this table
CREATE TABLE test (
ID INT,
user_id INT,
createdAt DATE,
status_id INT
);
INSERT INTO test VALUES
(1, 13, '2020-01-01', 8),
(2, 13, '2020-01-03', 8),
(3, 13, '2020-01-06', 8),
(4, 13, '2020-01-02', 7),
(5, 13, '2020-01-03', 7),
(6, 14, '2020-03-03', 8),
(7, 13, '2020-03-04', 4),
(8, 15, '2020-04-04', 7),
(9, 14, '2020-03-02', 6),
(10, 14, '2020-03-10', 5),
(11, 13, '2020-04-10', 8);
select * from test where status_id != 7
order by createdAt;
+----+---------+------------+-----------+
| ID | user_id | createdAt | status_id |
+----+---------+------------+-----------+
| 1 | 13 | 2020-01-01 | 8 |
| 2 | 13 | 2020-01-03 | 8 |
| 3 | 13 | 2020-01-06 | 8 |
| 9 | 14 | 2020-03-02 | 6 |
| 6 | 14 | 2020-03-03 | 8 |
| 7 | 13 | 2020-03-04 | 4 |
| 10 | 14 | 2020-03-10 | 5 |
+----+---------+------------+-----------+
the id is the id of the transaction, user_Id is the id of the users who doing the transaction, createdAt are the date transaction happen, status_id is the status for the transaction (if the status_Id is 7, then the transaction are denied or not approval).
so on this case, I want to find out time difference for every approval transaction on every repeat users on time range between '2020-02-01' until '2020-04-01', repeat users are the users who doing transaction before the end of the time range, and at least doing 1 transaction again in the time range, on this case, users are doing approval transaction before '2020-04-01' and at least doing 1 more approval transaction again in between '2020-02-01' and '2020-04-01'.
for that problem i used this query based on answers from @Akina
-- Get pairs (current transaction, previous transaction) for these users
SELECT t1.user_id,
t1.createdAt,
t2.createdAt,
DATEDIFF(t2.createdAt, t1.createdAt) diff
-- table for a transaction
FROM test t1
-- table for prev. transaction
JOIN test t2 ON t1.user_id = t2.user_id
AND t1.createdAt < t2.createdAt
AND 7 NOT IN (t1.status_id, t2.status_id)
-- get data only for users from prev. query
JOIN (SELECT t3.user_id
FROM test t3
WHERE t3.status_id != 7
GROUP BY t3.user_id
HAVING SUM(t3.createdAt < '2020-04-01') > 1
AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id
-- check that there is no approved transaction between selected transactions
WHERE NOT EXISTS (SELECT NULL
FROM test t5
WHERE t1.user_id = t5.user_id
AND t5.status_id != 7
AND t1.createdAt < t5.createdAt
AND t5.createdAt < t2.createdAt)
the output table was like this
+----------+------------+------------+------+
| user_id | createdAt | createdAt | diff |
+----------+------------+------------+------+
| 13 | 2020-01-01 | 2020-01-03 | 2 |
| 13 | 2020-01-03 | 2020-01-06 | 3 |
| 14 | 2020-03-02 | 2020-03-03 | 1 |
| 13 | 2020-01-06 | 2020-03-04 | 58 |
| 14 | 2020-03-03 | 2020-03-10 | 7 |
+----------+------------+------------+------+
the problem is, this query count time difference in time range ('2020-02-01' until '2020-04-01') for each users, and count time difference also before the time range (see users_id 13, that users also counting time difference in date '2020-01-01' until '2020-01-03'). what i want is, if the users had a transaction before time range, i want counting only his users_id last transaction before time range (on this case, users_id 13 who i want to counting just only time difference in '2020-01-06' until '2020-03-04' because in 06 january 2020 is the date when the users last transaction before time range. so on that case, the expected results was like this :
+---------+------------+------------+------+
| user_id | createdAt | createdAt | diff |
+---------+------------+------------+------+
| 14 | 2020-03-02 | 2020-03-03 | 1 |
| 13 | 2020-01-06 | 2020-03-04 | 58 |
| 14 | 2020-03-03 | 2020-03-10 | 7 |
+---------+------------+------------+------+
I think that you only have to exclude every transaction, that has ended before the start of the time frame BETWEEN '2020-02-01' AND '2020-04-01
, because they don't interest.
As you are already excluding everything that is beyond the time frame
Still you know when you enter the data when a transaction starts and ends and so you should mark the rows that belong together with in an extra column that would make your query much simpler in addition to your status that is not usable, because it repeat itself
SELECT t1.user_id, t1.createdAt, t2.createdAt createcompare, DATEDIFF(t2.createdAt, t1.createdAt) diff -- table for a transaction FROM test t1 -- table for prev. transaction JOIN test t2 ON t1.user_id = t2.user_id AND t1.createdAt < t2.createdAt AND 7 NOT IN (t1.status_id, t2.status_id) JOIN (SELECT t3.user_id FROM test t3 WHERE t3.status_id != 7 GROUP BY t3.user_id HAVING SUM(t3.createdAt < '2020-04-01') > 1 AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id WHERE NOT EXISTS (SELECT NULL FROM test t5 WHERE t1.user_id = t5.user_id AND t5.status_id != 7 AND t1.createdAt < t5.createdAt AND t5.createdAt < t2.createdAt) HAViNG createcompare > '2020-02-01'
user_id | createdAt | cretecompare | diff ------: | :--------- | :----------- | ---: 14 | 2020-03-02 | 2020-03-03 | 1 13 | 2020-01-06 | 2020-03-04 | 58 14 | 2020-03-03 | 2020-03-10 | 7 13 | 2020-03-04 | 2020-04-10 | 37
db<>fiddle here
UPDATE:
That actually makes more sense now
SELECT t1.user_id, t1.createdAt cretecompare1, t2.createdAt cretecompare2, DATEDIFF(t2.createdAt, t1.createdAt) diff -- table for a transaction FROM test t1 -- table for prev. transaction JOIN test t2 ON t1.user_id = t2.user_id AND t1.createdAt < t2.createdAt AND 7 NOT IN (t1.status_id, t2.status_id) JOIN (SELECT t3.user_id FROM test t3 WHERE t3.status_id != 7 GROUP BY t3.user_id HAVING SUM(t3.createdAt < '2020-04-01') > 1 AND SUM(t3.createdAt BETWEEN '2020-02-01' AND '2020-04-01')) t4 ON t1.user_id = t4.user_id WHERE NOT EXISTS (SELECT NULL FROM test t5 WHERE t1.user_id = t5.user_id AND t5.status_id != 7 AND t1.createdAt < t5.createdAt AND t5.createdAt < t2.createdAt) HAViNG cretecompare2 BETWEEN '2020-02-01' AND '2020-04-01'
user_id | cretecompare1 | cretecompare2 | diff ------: | :------------ | :------------ | ---: 14 | 2020-03-02 | 2020-03-03 | 1 13 | 2020-01-06 | 2020-03-04 | 58 14 | 2020-03-03 | 2020-03-10 | 7
db<>fiddle here