I need to implement a timetravel view for prices in mysql. The base price table is this:
CREATE TABLE product_price (
product_id INT(11) NOT NULL,
date_valid TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
price DECIMAL(15,4) NOT NULL,
PRIMARY KEY(product_id,date_valid)
);
The idea is that as time passes I select the right valid prices I have entered in advance. Maybe the concept can be clearer later. I need to create a view so that for each product_id I get the latest price. After a while I have found the SELECT that does what I need:
SELECT * FROM (
SELECT product_id,price FROM product_pric
WHERE date_valid <= CURRENT_TIMESTAMP
ORDER BY date_valid DESC
) xx GROUP BY product_id;
In order to create the needed view I understood I cannot use the subselect and need to create one or more intermediate views. Like this:
CREATE VIEW v_product_price_time AS
SELECT product_id,price FROM product_pric
WHERE date_valid <= CURRENT_TIMESTAMP
ORDER BY date_valid DESC
;
CREATE VIEW v_product_price AS
SELECT * FROM v_product_price_time GROUP BY product_id;
What I then get is not the sameas the original query I've written. For example, I populate the table with just two rows:
INSERT INTO product_price (product_id,date_valid,price ) VALUES ( 1,'2013-01-01',41.40 );
INSERT INTO product_price (product_id,date_valid,price ) VALUES ( 1,'2013-01-03',42.0 );
The raw query returns the right data, (1,42.0), but querying the view doesn't. I always get (1,41.40).
Surely I am missing something as I don't know MySQL very well. With another opensource RDBMS I have already done similar stuff, but now I need to cope with MySQL v5.5 and have no way to change it. But the documentation and a few searches in the developers forums didn't lead me to a solution. Any idea on how to solve this? TIA.
Use this query, regardless of the fact that is in a view or not.
SELECT p1.* FROM (
SELECT * FROM product_price
WHERE date_valid <= CURRENT_TIMESTAMP
) p1
LEFT JOIN (
SELECT product_id, date_valid FROM product_price
WHERE date_valid <= CURRENT_TIMESTAMP
) p2
ON p1.product_id = p2.product_id AND p1.date_valid < p2.date_valid
WHERE p2.date_valid IS NULL
The query creates 2 derived tables, which is not quite efficient and is also a bit harder to read. You can try creating another view for that:
CREATE VIEW product_price_past_dates AS (
SELECT * FROM product_price
WHERE date_valid <= CURRENT_TIMESTAMP
);
And then rewriting the original query as:
SELECT p1.* FROM product_price_past_dates p1
LEFT JOIN product_price_past_dates p2
ON p1.product_id = p2.product_id AND p1.date_valid < p2.date_valid
WHERE p2.date_valid IS NULL
Then you can create the view on the query that uses the previous view:
CREATE VIEW v_product_price_time AS (
SELECT p1.* FROM product_price_past_dates p1
LEFT JOIN product_price_past_dates p2
ON p1.product_id = p2.product_id AND p1.date_valid < p2.date_valid
WHERE p2.date_valid IS NULL
);
And finish with the simplest query:
SELECT * FROM v_product_price_time;
Fiddle here.
Why isn't the GROUP BY not working: The error basically resides in an inappropriate use of a GROUP BY clause. The rule of thumb (although not 100% of the time true) would be to always use the same fields in the select as in the GROUP BY. Otherwise, MySQL will choose any value from the fields that are in the select and not in the GROUP BY.
For further information you should check the MySQL documentation. I think it is pretty clear.
Highly detailed explanation:
SELECT * FROM (
SELECT product_id,price FROM product_pric
WHERE date_valid <= CURRENT_TIMESTAMP
ORDER BY date_valid DESC
) xx GROUP BY product_id;
0 syntactical errors
1 semantical error
1 warning
Previous query should be fixed using one of the 2 basic greatest-n-per-group solutions. I've provided the shortest one which is the left join one.
CREATE VIEW v_product_price_time AS
SELECT product_id,price FROM product_pric
WHERE date_valid <= CURRENT_TIMESTAMP
ORDER BY date_valid DESC
;
0 syntactical errors
0 semantical errors
0 warnings
Perfectly valid query. No comments at all.
CREATE VIEW v_product_price AS
SELECT * FROM v_product_price_time GROUP BY product_id;
0 syntactical errors
1 semantical errors
0 warnings
So, you're basically comparing 2 unpredictable results and expecting to be the same. The funny thing is that the fact of comparing 2 unpredictable results is more error prone than just 1 unpredictable result. So consider yourself lucky of having that increase of chance in finding this error in your code. The guy who asked this question: How can I update the value of one field to the most often used value of another field?
will discover a funny surprise when he finds out that the query was not working as he expected to be. Furthermore, 3 out of 4 answers in there are not properly grouping by correctly and returning unpredictable results. Not to mention Bohemian's comment in which he states his code will always work
. So, congratulations Enzo, you've just divided by 0 :)
Hope this helps.