I am working on the backend of an application that needs to protect an external API from too many requests per user per month. So I need to keep track of number of requests from each user. I have a lot of experience with concurrent programming but almost no experience with db management or MySQL,
So, suppose I want to execute the equivalent of the following pseudocode, where I mix SQL statements with application-level logic, and where lookups
is a table:
mutex mtx;
set @userid = 'usrid1';
set @date = CURDATE();
set @month = CONCAT_WS('-', YEAR(@date), MONTH(@date));
mtx.lock()
select counter from lookups where userid=@userid and month=@month;
if returned rows == 0:
insert into lookups set month=@month, userid=@userid, counter=1;
else:
update lookups set counter=counter+1;
mtx.unlock()
Except, of course, I don't have access to that mutex. At first I thought it would be enough to just wrap the whole thing inside a transaction, but upon closer inspection of the MySQL reference it seems that may not be enough to avoid possible race conditions, such as two threads/processes reading the same counter
value. Is it good enough then, in mysql with default settings, to do the following:
set @userid = 'usrid1';
set @date = CURDATE();
set @month = CONCAT_WS('-', YEAR(@date), MONTH(@date));
start transaction;
select counter from lookups where userid=@userid and month=@month for update;
if returned rows == 0:
insert into lookups set month=@month, userid=@userid, counter=1;
else:
update lookups set counter=counter+1;
commit;
From what I can glean from the reference, it looks like it should be enough, and it should cause neither race conditions nor deadlocks, but the reference is long winded and complex, so I wanted to ask here to be sure. Performance isn't important. The reference states that MySQL's default isolation level is REPEATABLE READ
.
I suggest this solution:
create table lookups (userid varchar(20), yearmonth date, counter int, primary key (userid, yearmonth));
insert into lookups set userid = 'usrid1',
yearmonth = date_format(curdate(), '%Y-%m-01'),
counter = last_insert_id(1)
on duplicate key update
counter = last_insert_id(counter + 1);
select last_insert_id(); -- returns the new value, whether 1 or the updated value.
This means you don't have to check if a row exists, it will either insert it or update it atomically.
The last_insert_id(<expression>)
trick is documented at the end of the entry for that function: https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_last-insert-id