Search code examples
phppostgresqltriggersauto-increment

Generate gap free numbers with database trigger


Together with my team, I am working on a functionality to generate invoice numbers. The requirements says that:

  • there should be no gaps between invoice numbers
  • the numbers should start from 0 every year (the together with the year we will have a unique key)
  • the invoice numbers should grow accordinlgy to the time of the creation of the invoices

We are using php and postgres. We tought to implement this in the following way:

  • each time a new invoice is persisted on the database we use a BEFORE INSERT trigger
  • the trigger executes a function that retrieves a new value from a postgres sequence and writes it on the invoice as its number

Considering that multiple invoices could be created during the same transaction, my question is: is this a sufficiently safe approach? What are its flaws? How would you suggest to improve it?


Solution

  • Introduction

    I believe the most crucial point here is:

    • there should be no gaps between invoice numbers

    In this case you cannot use a squence and an auto-increment field (as others propose in the comments). Auto-increment field use sequence under the hood and nextval(regclass) function increments sequence's counter no matter if transaction succeeded or failed (you point that out by yourself).


    Update:

    What I mean is you shouldn't use sequences at all, especially solution proposed by you doesn't eliminates gap possibility. Your trigger gets new sequence value but INSERT could still failed.


    Sequences works this way because they mainly meant to be used for PRIMARY KEYs and OIDs values generation where uniqueness and non-blocking mechanism is ultimate goal and gaps between values are really no big deal.

    In your case however the priorities may be different, but there are couple things to consider.

    Simple solution

    First possible solution to your problem could be returning new number as maximum value of currently existing ones. It can be done in your trigger:

    NEW.invoice_number =
            (SELECT foo.invoice_number
             FROM invoices foo
             WHERE foo._year = NEW._year
             ORDER BY foo.invoice_number DESC NULLS LAST LIMIT 1
            ); /*query 1*/
    

    This query could use your composite UNIQUE INDEX if it was created with "proper" syntax and columns order which would be the "year" column in the first place ex.:

    CREATE UNIQUE INDEX invoice_number_unique
    ON invoices (_year, invoice_number DESC NULLS LAST);
    

    In PostgreSQL UNIQUE CONSTRAINTs are implemented simply as UNIQUE INDEXes so most of the times there no difference which command you will use. However using that particular syntax presented above, makes possible to define order on that index. It's really nice trick which makes /*query 1*/ quicker than simple SELECT max(invoice_number) FROM invoices WHERE _year = NEW.year if the invoice table gets bigger.

    This is simple solution but has one big drawback. There is possibility of race condition when two transactions try to insert invoice at the same time. Both could acquire the same max value and the UNIQUE CONSTRAINT will prevent the second one from committing. Despite that it could be sufficient in some small system with special insert policy.

    Better solution

    You may create table

    CREATE TABLE invoice_numbers(
       _year INTEGER NOT NULL PRIMARY KEY,
       next_number_within_year INTEGER
    );
    

    to store next possible number for certain year. Then, in AFTER INSERT trigger you could:

    1. Lock invoice_numbers that no other transaction could even read the number LOCK TABLE invoice_numbers IN ACCESS EXCLUSIVE;
    2. Get new invoice number new_invoice_number = (SELECT foo.next_number_within_year FROM invoice_numbers foo where foo._year = NEW.year);
    3. Update number value of new added invoice row
    4. Increment UPDATE invoice_numbers SET next_number_within_year = next_number_within_year + 1 WHERE _year = NEW._year;

    Because table lock is hold by the transaction to its commit, this probably should be the last trigger fired (read more about trigger execution order here)


    Update:

    Instead of locking whole table with LOCK command check link provided by Craig Ringer


    The drawback in this case is INSERT operation performance drop down --- only one transaction at the time can perform insert.