I'm having trouble generating Profit & Loss reports in an accounting system.
Each general journal entry has an amount, a source account, and a destination account. Here is a simplified schema, and some sample data:
CREATE TABLE `sa_general_journal` (
`ID` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Date` timestamp NOT NULL DEFAULT current_timestamp(),
`Item` varchar(1024) NOT NULL DEFAULT '',
`Amount` decimal(9,2) NOT NULL DEFAULT 0.00,
`Source` int(10) unsigned NOT NULL,
`Destination` int(10) unsigned NOT NULL,
PRIMARY KEY (`ID`),
KEY `Date` (`Date`),
KEY `Source` (`Source`),
KEY `Destination` (`Destination`),
CONSTRAINT `sa_credit-account` FOREIGN KEY (`Destination`) REFERENCES `sa_accounts` (`ID`),
CONSTRAINT `sa_debit-account` FOREIGN KEY (`Source`) REFERENCES `sa_accounts` (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=21561 DEFAULT CHARSET=utf8;
CREATE TABLE `sa_accounts` (
`ID` int(10) unsigned NOT NULL,
`Name` varchar(255) NOT NULL,
`Type` enum('Asset','Liability','Income','Expense'),
`Report` enum('BS','PL'), -- for "Balance Sheet" or "Profit & Loss"
PRIMARY KEY (`ID`),
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO sa_account (`ID`, `Name`, `Type`, `Report`)
VALUES (1009999, "Test chequing account", "Asset", "BS"),
(4059999, "Test Income account", "Income", "PL"),
(5059999, "Test Expense account", "Expense", "PL");
INSERT INTO sa_general_journal (`ID`, `Date`, `Item`, `Amount`, `Source`, `Destination`)
VALUES (NULL, "2020-05-03", "Test income transaction", 10.10, 4059999, 1009999),
(NULL, "2020-05-03", "Test expense transaction", 1.01, 1009999, 5059999);
This is $10.10 of income, deposited in the chequing account, and an expense of $1.01 that comes out of the chequing account.
To obtain a balance for a Profit & Loss statement, one needs to sum all the Amount
entries for each occurrence of an account in the Source
column, and then subtract all the Amount
entries where that account is in the Destination
column.
Expected result would be:
<table>
<th>ID</th><th>Name</th><th>Debits</th><th>Credits</th><th>Net</th></tr>
<tr><td>1009999</td><td>Test chequing account</td><td>-1.01</td><td>10.10</td><td>9.09</td></tr>
<tr><td>4059999</td><td>Test income transaction</td><td>-10.10</td><td><i>NULL</i></td><td>-10.10</td></tr>
<tr><td>5059999</td><td>Test expense transaction</td><td><i>NULL</i></td><td>1.01</td><td>1.01</td></tr>
</table>
My first approach was somewhat naive, to query the sa_general_journal
table joined with the sa_accounts
table, selected by either the Source
and Destination
columns, using an IF function to negate the Amount
if the Destination
contained the account of interest. I do this successfully when querying an individual account using a prepared statement:
SELECT
DATE_FORMAT(exp.Date, '%Y') AS `Year`,
AVG(exp.Amount) AS `Avg`,
SUM(IF(Source = ?, 0 - exp.Amount, NULL)) AS `Debits`,
SUM(IF(Destination = ?, exp.Amount, NULL)) AS `Credits`,
SUM(IF(Source = ?, 0 - exp.Amount, exp.Amount)) AS `Net`
FROM sa_general_journal exp
LEFT JOIN sa_accounts Destination ON exp.Destination = Destination.ID
WHERE Destination = ?
OR Source = ?
GROUP BY `Year`
where all four placeholders hold the same account ID.
But this fails when GROUPing by account ID without the WHERE clause, to get a listing of all account balances — substituting a dynamic account ID in place of the static account ID in a placeholder never appears to hit the "0 - exp.Amount" code… Duh! The query is selected as a set, not steps in a procedure. I get that.
Making the SUM statements subqueries works, but it is dreadfully slow, apparently doing the three subqueries for each of tens of thousands of records!
So, I thought I could factor the subqueries with a couple Common Table Expressions, but this does not appear to work properly, either, as it is still seems to be simply returning SUM(Amount
), without subtracting the Amount
s that are Destination
s rather than Source
s.
WITH
source1 AS (SELECT
src.ID,
src.Name,
SUM(Amount) Amount
FROM sa_general_journal gj
LEFT JOIN sa_accounts src ON gj.`Source` = src.ID
WHERE src.Report = "PL" -- AND YEAR(`Date`) = 2019
GROUP BY src.ID),
destination1 AS (SELECT
dst.ID,
dst.Name,
SUM(0-Amount) Amount
FROM sa_general_journal gj
LEFT JOIN sa_accounts dst ON gj.`Destination` = dst.ID
WHERE dst.Report = "PL" -- AND YEAR(`Date`) = 2019
GROUP BY dst.ID)
SELECT ID, Name, sum(Amount)
FROM source1
UNION ALL
SELECT ID, Name, sum(Amount)
FROM destination1
GROUP BY ID
I'm guessing I'm making some silly assumption, or doing something stupid, so any advice is appreciated!
Without sample data it's hard to be 100% certain, but this query should give the results you want. It uses a UNION
query to split transactions into their Source
and Destination
accounts, and then uses conditional aggregation to SUM
the transactions for each account.
WITH CTE AS (
SELECT Source AS account, 'debits' AS type, -Amount AS Amount
FROM sa_general_journal
UNION ALL
SELECT Destination, 'credits', Amount
FROM sa_general_journal
)
SELECT acc.ID, acc.Name,
SUM(CASE WHEN CTE.type = 'debits' THEN Amount END) AS Debits,
SUM(CASE WHEN CTE.type = 'credits' THEN Amount END) AS Credits,
SUM(Amount) AS Net
FROM CTE
JOIN sa_accounts acc ON CTE.account = acc.ID
GROUP BY acc.ID, acc.Name
Output (for your sample data):
ID Name Debits Credits Net
4059999 Test Income account -10.1 null -10.1
1009999 Test chequing account -1.01 10.1 9.09
5059999 Test Income account null 1.01 1.01