Search code examples
phpmysqltransactionswebserver

Concurrent database access


When the web server receives a request for my PHP script, I presume the server creates a dedicated process to run the script. If, before the script exits, another request to the same script comes, another process gets started -- am I correct, or the second request will be queued in the server, waiting for the first request to exit? (Question 1)

If the former is correct, i.e. the same script can run simultaneously in a different process, then they will try to access my database.

When I connect to the database in the script:

$DB = mysqli_connect("localhost", ...);

query it, conduct more or less lengthy calculations and update it, I don't want the contents of the database to be modified by another instance of a running script.

Question 2: Does it mean that since connecting to the database until closing it:

mysqli_close($DB);

the database is blocked for any access from other software components? If so, it effectively prevents the script instances from running concurrently.

UPDATE: @OllieJones kindly explained that the database was not blocked.

Let's consider the following scenario. The script in the first process discovers an eligible user in the Users table and starts preparing data to append for that user in the Counter table. At this moment the script in the other process preempts and deletes the user from the Users table and the associate data in the Counter table; it then gets preempted by the first script which writes the data for the user no more existing. These data become in the head-detached state, i.e. unaccessible.

How to prevent such a contention?


Solution

  • In modern web servers, there's a pool of processes (or possibly threads) handling requests from users. Concurrent requests to the same script can run concurrently. Each request-handler has its own connection to the DBMS (they're actually maintained in a pool, but that's a story for another day).

    The database is not blocked while individual request-handlers are using it, unless you block it explicitly by locking a table or doing a request like SELECT ... FOR UPDATE. For more information on this deep topic, read about transactions.

    Therefore, it's important to write your database queries in such a way that they won't interfere with each other. For example, if you need to learn the value of an auto-incremented column right after you insert a row, you should use LAST_INSERT_ID() or mysqli_insert_id() instead of trying to query the data base: another user may have inserted another row in the meantime.

    The system test discipline for scaled-up web sites usually involves a rigorous load test in order to shake out all this concurrency.

    If you're doing a bunch of work on a particular entity, in your case a User, you use a transaction.

    First you do

     BEGIN
    

    to start the transaction. Then you do

      SELECT whatever FROM User WHERE user_id = <<whatever>> FOR UPDATE
    

    to choose the user and mark that user's row as busy-being-updated. Then you do all the work you need to do to fill out various rows in various tables relating to that user.

    Finally you do

     COMMIT
    

    If you messed things up, or don't want to go through with the change, you do

    ROLLBACK
    

    and all your changes will be restored to their state right before the SELECT ... FOR UPDATE.

    Why does this work? Because if another client does the same SELECT .... FOR UPDATE, MySQL will delay that request until the first one either gets COMMIT or ROLLBACK.

    If another client works with a different userid, the operations may proceed concurrently.

    You need the InnoDB access method to use transactions: MyISAM doesn't support them.