I have a typical many-to-many relationship involving 3 tables:
user (id, name)
product (id, name, current_price, image, url)
user_product_mapping (user_id, product_id, my_purchase_price)
. Note that the mapping table has an extra column my_purchase_price
, as user can specify his own price.
I am building an API that allows bulk insert, e.g.
curl -d '{"products": [\
{"image":"http://example.com/images/testproduct", "name":"testproduct", "url":"http://example.com/products/test", "current_price":50, "my_purchase_price":40},\
{"image":"http://example.com/images/testproduct2","name":"testproduct2","url":"http://example.com/products/test2","current_price":100,"my_purchase_price":80} \
]}' \
-X POST http://example.com/users/1/products -H "Content-Type: application/json"
In this case, the user ID is 1 and there are 2 products to insert, with their price at 50, 100 respectively, and the user's specified price at 40, 80 respectively.
My question is: how can I effectively insert the data into the 3 tables (creating a user entry if none exists)? I would not want to insert 1 by 1. Hope I can insert them just in a sproc.
The problem is that I don't want duplicates in product
, user
, and product_user_mapping
tables. So if there is already such products in product table (URL is a unique key), then I will just update their prices. If there is already mappings in product_user_mapping
table, then only my_purchase_price will be updated. I basically would need 2 insert if not exist and update on duplicates, however the second insert update need to know the IDs of the first insert update and I cannot find an effective way to update my_purchase_price
during second insert/update. To make things more interesting, the API actually allows user to specify the user's name as well, meaning if user 1 does not exist in the user table, I will create an entry for user 1 in user table as well.
I purposely do not use the URL as the primary key in product table, as I think using the URL as a primary key is a bad idea in general (is it?). I have tried several ways, but none of them to my satisfaction:
Method 1:
INSERT INTO product ON DUPLICATE KEY (url) UPDATE price, image
INSERT INTO user_product_mapping ON DUPLICATE KEY UPDATE my_purchase_price
.Method 2:
INSERT INTO product ON DUPLICATE KEY (url) UPDATE price, image;
DELETE FROM user_product_mapping
WHERE user_id=1
AND product_id NOT IN (
SELECT product id
FROM product
WHERE url NOT IN (the urls passed from api)
);
INSERT INTO user_product_mapping;
I seem to have write at least 3 insert-update statements and may need a cursor as well. I have a feeling that I might have over complicated things.
In case someone comes across the same issue, my solution is to make the URL the primary key in the product table (not sure whether this is a good idea). Thus before I start all the insert operations, I know all the keys I have before hand. I then issue two insert-update-on-duplicate queries (note this step can be done in one SQL statement):
INSERT INTO product (url, name, image, current_price)
VALUES
('http://example.com/product/test', 'testproduct', 'http://example.com/images/testproduct', 10)
ON DUPLICATE KEY UPDATE
url=VALUES(url),
image=VALUES(image),
current_price=VALUES(current_price);
INSERT INTO user_product (user_id, product_url, my_purchase_price)
VALUES
(1, 'http://example.com/products/test', 8)
ON DUPLICATE KEY UPDATE
my_price=VALUES(my_price);
Remember to set user_id
and url
as composite primary key for the mapping table in order for the ON DUPLICATE KEY UPDATE
clause to apply
ALTER TABLE user_product ADD PRIMARY KEY (user_device_id, product_url);