Search code examples
mysqlcommon-table-expressionaccounting

Need to negate a column based on another column, as part of an aggregate total


I'm having trouble generating Profit & Loss reports in an accounting system.

Each general journal entry has an amount, a source account, and a destination account. Here is a simplified schema, and some sample data:

CREATE TABLE `sa_general_journal` (
  `ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `Date` timestamp NOT NULL DEFAULT current_timestamp(),
  `Item` varchar(1024) NOT NULL DEFAULT '',
  `Amount` decimal(9,2) NOT NULL DEFAULT 0.00,
  `Source` int(10) unsigned NOT NULL,
  `Destination` int(10) unsigned NOT NULL,
  PRIMARY KEY (`ID`),
  KEY `Date` (`Date`),
  KEY `Source` (`Source`),
  KEY `Destination` (`Destination`),
  CONSTRAINT `sa_credit-account` FOREIGN KEY (`Destination`) REFERENCES `sa_accounts` (`ID`),
  CONSTRAINT `sa_debit-account` FOREIGN KEY (`Source`) REFERENCES `sa_accounts` (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=21561 DEFAULT CHARSET=utf8;

CREATE TABLE `sa_accounts` (
  `ID` int(10) unsigned NOT NULL,
  `Name` varchar(255) NOT NULL,
  `Type` enum('Asset','Liability','Income','Expense'),
  `Report` enum('BS','PL'), -- for "Balance Sheet" or "Profit & Loss"
  PRIMARY KEY (`ID`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


INSERT INTO sa_account (`ID`, `Name`, `Type`, `Report`)
  VALUES (1009999, "Test chequing account", "Asset", "BS"),
         (4059999, "Test Income account", "Income", "PL"),
         (5059999, "Test Expense account", "Expense", "PL");
INSERT INTO sa_general_journal (`ID`, `Date`, `Item`, `Amount`, `Source`, `Destination`)
  VALUES (NULL, "2020-05-03", "Test income transaction", 10.10, 4059999, 1009999),
 (NULL, "2020-05-03", "Test expense transaction", 1.01, 1009999, 5059999);

This is $10.10 of income, deposited in the chequing account, and an expense of $1.01 that comes out of the chequing account.

To obtain a balance for a Profit & Loss statement, one needs to sum all the Amount entries for each occurrence of an account in the Source column, and then subtract all the Amount entries where that account is in the Destination column.

Expected result would be:

<table>
<th>ID</th><th>Name</th><th>Debits</th><th>Credits</th><th>Net</th></tr>
<tr><td>1009999</td><td>Test chequing account</td><td>-1.01</td><td>10.10</td><td>9.09</td></tr>
<tr><td>4059999</td><td>Test income transaction</td><td>-10.10</td><td><i>NULL</i></td><td>-10.10</td></tr>
<tr><td>5059999</td><td>Test expense transaction</td><td><i>NULL</i></td><td>1.01</td><td>1.01</td></tr>
</table>

My first approach was somewhat naive, to query the sa_general_journal table joined with the sa_accounts table, selected by either the Source and Destination columns, using an IF function to negate the Amount if the Destination contained the account of interest. I do this successfully when querying an individual account using a prepared statement:

SELECT
  DATE_FORMAT(exp.Date, '%Y') AS `Year`,
  AVG(exp.Amount) AS `Avg`,
  SUM(IF(Source = ?, 0 - exp.Amount, NULL)) AS `Debits`,
  SUM(IF(Destination = ?, exp.Amount, NULL)) AS `Credits`,
  SUM(IF(Source = ?, 0 - exp.Amount, exp.Amount)) AS `Net`
FROM sa_general_journal exp
  LEFT JOIN sa_accounts Destination ON exp.Destination = Destination.ID
WHERE Destination = ?
  OR Source = ?
GROUP BY `Year`

where all four placeholders hold the same account ID.

But this fails when GROUPing by account ID without the WHERE clause, to get a listing of all account balances — substituting a dynamic account ID in place of the static account ID in a placeholder never appears to hit the "0 - exp.Amount" code… Duh! The query is selected as a set, not steps in a procedure. I get that.

Making the SUM statements subqueries works, but it is dreadfully slow, apparently doing the three subqueries for each of tens of thousands of records!

So, I thought I could factor the subqueries with a couple Common Table Expressions, but this does not appear to work properly, either, as it is still seems to be simply returning SUM(Amount), without subtracting the Amounts that are Destinations rather than Sources.

WITH
 source1 AS (SELECT
  src.ID,
  src.Name,
  SUM(Amount) Amount
FROM sa_general_journal gj
  LEFT JOIN sa_accounts src ON gj.`Source` = src.ID
WHERE src.Report = "PL" -- AND YEAR(`Date`) = 2019
GROUP BY src.ID),
 destination1 AS (SELECT
  dst.ID,
  dst.Name,
  SUM(0-Amount) Amount
FROM sa_general_journal gj
  LEFT JOIN sa_accounts dst ON gj.`Destination` = dst.ID
WHERE dst.Report = "PL" --  AND YEAR(`Date`) = 2019
GROUP BY dst.ID)
SELECT ID, Name, sum(Amount)
FROM source1
UNION ALL
SELECT ID, Name, sum(Amount)
FROM destination1
GROUP BY ID

I'm guessing I'm making some silly assumption, or doing something stupid, so any advice is appreciated!


Solution

  • Without sample data it's hard to be 100% certain, but this query should give the results you want. It uses a UNION query to split transactions into their Source and Destination accounts, and then uses conditional aggregation to SUM the transactions for each account.

    WITH CTE AS (
      SELECT Source AS account, 'debits' AS type, -Amount AS Amount
      FROM sa_general_journal
      UNION ALL 
      SELECT Destination, 'credits', Amount
      FROM sa_general_journal
    )
    SELECT acc.ID, acc.Name, 
           SUM(CASE WHEN CTE.type = 'debits'  THEN Amount END) AS Debits,
           SUM(CASE WHEN CTE.type = 'credits' THEN Amount END) AS Credits,
           SUM(Amount) AS Net
    FROM CTE
    JOIN sa_accounts acc ON CTE.account = acc.ID
    GROUP BY acc.ID, acc.Name
    

    Output (for your sample data):

    ID          Name                    Debits  Credits Net
    4059999     Test Income account     -10.1   null    -10.1
    1009999     Test chequing account   -1.01   10.1    9.09
    5059999     Test Income account     null    1.01    1.01
    

    Demo on dbfiddle