SQL: Select Running Total For Each Category In Transactional Table, Sorted By Date

I have a transactional table providing: an alphanumeric PK, a timestamp, a user_id, a in/out string column, and an amount column.

id          time                  user_id   io    amount
38hw        2019-10-18 18:35:09   2         in     1
nv49        2019-10-18 18:35:10   3         in    50
83ha        2019-10-18 18:35:11   5         in     2
ja03        2019-10-18 18:35:12   4         out    2
019c        2019-10-18 18:35:13   1         out   75
ac5r        2019-10-18 18:35:14   3         in    20
as30        2019-10-18 18:35:15   3         in     3
34ds        2019-10-18 18:35:16   4         in     7
12my        2019-10-18 18:35:17   2         in    50
dk20        2019-10-18 18:35:18   4         in    50
sk18        2019-10-18 18:35:19   1         in     7
am35        2019-10-18 18:35:20   2         in     3
mc92        2019-10-18 18:35:21   2         out    8
alov        2019-10-18 18:35:22   3         in     4
ap34        2019-10-18 18:35:23   1         out    6

I am trying to create another column that provides the running total for that user_id each time it shows up. These user_id's do not initially show up with a 0 amount, so that must be assumed the first time they show up.

I've considered its likely possible to do this with some helper columns. My thought process is something like this:

Create a column to indicate how many times the user_id shows up prior to the value in time column. Maybe called occurence_num column
Create a column that makes amount easier to work with, like (case when io='in' then amount else -1*amount end) as balance_adjust
Group by user_id and (on each row) sum() all the balance_adjust values where occurence_num is less than the current records.

I'm having a hard time testing these ideas out though. I'm working in a fairly large database, SQLite with 22 million rows. The table can be altered/updated as needed. It was stored this way in favor of keeping the ETL as simple as possible because there was a lot of data to pull and a lot of pages to pull from. My desired output would look something like this:

id          time                  user_id   io    amount    running_total
38hw        2019-10-18 18:35:09   2         in     1         1
nv49        2019-10-18 18:35:10   3         in    50        50
83ha        2019-10-18 18:35:11   5         in     2         2
ja03        2019-10-18 18:35:12   4         out    2        -2
019c        2019-10-18 18:35:13   1         out   75       -75
ac5r        2019-10-18 18:35:14   3         in    20        70
as30        2019-10-18 18:35:15   3         in     3        73
34ds        2019-10-18 18:35:16   4         in     7         5
12my        2019-10-18 18:35:17   2         in    50        51
dk20        2019-10-18 18:35:18   4         in    50        55
sk18        2019-10-18 18:35:19   1         in     7       -68
am35        2019-10-18 18:35:20   2         in     3        54
mc92        2019-10-18 18:35:21   2         out    8        46
alov        2019-10-18 18:35:22   3         in     4        77
ap34        2019-10-18 18:35:23   1         out    6       -74

I can get the overall total of each user this way, but it takes a few minutes:

SELECT 
    user_id,
    sum(case when io='in' then amount else -1*amount end) as balance
FROM 
    transactions 
GROUP BY
    user_id

I think expanding on this, a OVER/PARTITION clause will be a good call, but I'm not sure if it's the right call given the size of this database.

Thanks for the help.

Edit: I should mention, the real data may include duplicates in the time column. Transactions could have occurred at the same time, as it's only granular to the second.

Solution

A small tweak to your attempt should do it. It's sufficient to turn your sum into a "running sum" using the corresponding window function, that will compute the running amount by partitioning on user and ordering on time.

And if you have tied times, you can rely on ordering by id, which will break the tie and make the sum work correctly.

SELECT *,
       SUM(CASE WHEN io = 'in' 
                THEN amount 
                ELSE -amount 
           END) OVER(PARTITION BY user_id ORDER BY time, id) as balance
FROM transactions 
ORDER BY time

Output:

id	time	user_id	io	amount	balance
38hw	2019-10-18 18:35:09	2	in	1	1
nv49	2019-10-18 18:35:10	3	in	50	50
83ha	2019-10-18 18:35:11	5	in	2	2
ja03	2019-10-18 18:35:12	4	out	2	-2
019c	2019-10-18 18:35:13	1	out	75	-75
ac5r	2019-10-18 18:35:14	3	in	20	70
as30	2019-10-18 18:35:15	3	in	3	73
34ds	2019-10-18 18:35:16	4	in	7	5
12my	2019-10-18 18:35:17	2	in	50	51
dk20	2019-10-18 18:35:18	4	in	50	55
sk18	2019-10-18 18:35:19	1	in	7	-68
am35	2019-10-18 18:35:20	2	in	3	54
mc92	2019-10-18 18:35:21	2	out	8	46
alov	2019-10-18 18:35:22	3	in	4	77
ap34	2019-10-18 18:35:23	1	out	6	-74

Check the demo here.

Note: The last ORDER BY clause is not needed: it's just for visualization purposes.