Search code examples
databasetransactions

what exactly happen when transaction commit but database crash


A client sends a transaction to database like this:

begin transaction
sql1
sql2
sql3
...
end transaction (commit)

From my point of view, the database will follow the steps below:

  • Step1: receive commands one by one
  • Step2: execute commands one by one in memory
  • Step3: save result from memory into disk (redo log)
  • Step4: add commit flag in log when receive commit command
  • Step5: response success ack to client

My questions:

  • Is the description above right and accurate?
  • If it's right and accurate, what will happen when that database crashes between step4 and step5? In this situation, transaction commits successly but client receives no commit success ack. So client can't tell whether transaction commits successly.

Solution

  • In Database , to prevent such problems, they will do this things during transactions as follows: Before making any changes to the database, transaction details are fully recorded in the transaction log. In PostgreSQL, this is called WAL (Write-Ahead Logging) I don't know other dbs.

    The commit process consists of two phases => Two-Phase Commit.

    Prepare Phase : The transaction is prepared, and all details are written to the transaction log. At this stage, the transaction is not yet considered complete.

    Commit Phase : In this phase, the transaction log is transaction is marked as commited, and changes are applied to the database asynchronously .

    If a crash occurs, there are two scenarios:

    If the crash happens during the Prepare Phase and the transaction has not been committed: After the database restarts, it checks the transaction log and rolls back the transaction based on the log.

    If the crash happens after the transaction is committed (Commit Phase): Upon restart, the database uses the transaction log to apply the changes to the database.

    The restart process varies across different databases:

    MySQL/SQL Server: These databases will not fully start until all incomplete transactions are rolled back. PostgreSQL: It identifies the transaction and starts rolling it back. New transactions simply ignore the incomplete ones. The dead data from incomplete transactions is cleaned up using Vacuum tool or deleted by manual process.