I have two tables: Files
and FileLogs
and I'm trying to get rid of the FileLogs
table and store some data from it inside the Files
table.
Files
id | lastStatusId | meta |
---|---|---|
187 | 101 | NULL |
188 | 101 | { "foo": "bar" } |
FileLogs
id | fileId | statusId | createdAt | meta |
---|---|---|---|---|
1 | 187 | 101 | 2019-06-18 19:43:19 | NULL |
2 | 187 | 102 | 2019-06-18 19:43:22 | NULL |
3 | 187 | 202 | 2019-06-18 19:45:32 | { "foo2": "bar2" } |
I already added the lastStatusId
column to the Files table (defaults to 101).
To get the latest statusId
for every file from the logs into the latestStatusId
I run this query and it works great:
UPDATE
`Files` f1,
(
SELECT
f.id as fileId,
fl.statusId,
fl.meta
FROM
(
SELECT
MAX(id) as id,
fileId
FROM
`FileLogs`
GROUP BY
fileId
) AS latestIds
LEFT JOIN `FileLogs` AS fl ON latestIds.id = fl.id
LEFT JOIN `Files` AS f ON f.id = fl.fileId
ORDER BY
fl.id ASC
) temp1
SET
f1.lastStatusId = temp1.statusId
WHERE
f1.id = temp1.fileId;
I thought I could use a similar query to go over all FileLogs and write the status into the Files' meta
field like so:
UPDATE
`Files` f1,
(
SELECT
f.id as fileId,
fl.statusId,
fl.meta,
fl.createdAt
FROM
`FileLogs` AS fl
LEFT JOIN `Files` AS f ON f.id = fl.fileId
) temp1
SET
f1.meta = JSON_MERGE_PATCH(
COALESCE(f1.meta, '{}'),
JSON_OBJECT(
'statusInfo',
JSON_OBJECT(
CONCAT('', temp1.`statusId`),
JSON_OBJECT(
'createdAt',
temp1.createdAt
)
)
)
)
WHERE
f1.id = temp1.fileId;
Result of the subquery is something like this:
fileId statusId meta createdAt
187 101 NULL 2019-06-18 19:43:19
187 102 NULL 2019-06-18 19:43:22
187 202 {"foo": "bar"} 2019-06-18 19:43:26
But the SET is only run for the first f1.id = temp1.fileId
and not the second or third (which kind of makes sense) - and the result looks like this:
File id:187
meta
{
"statusInfo": {
"101": {
"createdAt": "2019-06-18 19:43:19.000000"
}
}
}
You can pre-aggregate the JSON in the subquery. Here's a demo:
select f.id,
json_objectagg(l.statusId,
json_object(
'createdAt', l.createdAt,
'meta', l.meta
)) as statusInfo
from Files as f
join FileLogs as l on f.id = l.fileId
group by f.id;
Result given your example data:
id: 187
statusInfo: {
"101": {
"meta": null,
"createdAt": "2019-06-18 19:43:19.000000"
},
"102": {
"meta": null,
"createdAt": "2019-06-18 19:43:22.000000"
},
"202": {
"meta": {
"foo2": "bar2"
},
"createdAt": "2019-06-18 19:45:32.000000"
}
}
Now that this resembles the JSON structure you want, we can wrap this into an UPDATE:
update Files
join (
select fileId,
json_objectagg(statusId,
json_object(
'createdAt', createdAt,
'meta', meta
)
) as statusInfo
from FileLogs
group by fileId
) as f on Files.id = f.fileId
set meta = json_object('statusInfo', f.statusInfo);
However, you should think about the implications of doing this.
Storing data in JSON takes more space than storing the same data in rows of FileLogs
.
Adding subsequent information to the File.meta
JSON document can be done with JSON merging, but that will certainly take more careful thought than simply INSERT a new row to FileLogs
.
How will you need to query the statusInfo in the future? If you need to search for entries with certain status, or which one is the most recent, or which one had a status entry on a given date, this could be tricky to search the JSON document, whereas it is much easier if you keep the status info in rows of FileLogs
. You should make sure you know how to do all the queries you need to do before you commit to using JSON.