Search code examples
pythontransactionsmaxlist-comprehensionaverage

Finding average and max value per account from a list


so I've got this data of a type 'list' (this is just a fraction of it to illustrate the point) stored in a variable called sorted_trans (I have pre-sorted it to group accountIds together):

Transaction(transactionId='T000664', accountId='A1', transactionDay=21, category='GG', transactionAmount=364.69), Transaction(transactionId='T000776', accountId='A1', transactionDay=24, category='GG', transactionAmount=329.63), Transaction(transactionId='T000313', accountId='A10', transactionDay=8, category='AA', transactionAmount=960.56), Transaction(transactionId='T000472', accountId='A10', transactionDay=12, category='AA', transactionAmount=707.74), Transaction(transactionId='T000596', accountId='A10', transactionDay=18, category='AA', transactionAmount=156.77), Transaction(transactionId='T000730', accountId='A10', transactionDay=23, category='AA', transactionAmount=577.76)

My task is to sort this data by 'accountId' for the last 5 days (I'm using the rolling window) and calculate: total 'transactionAmount' per 'accountId', average 'transactionAmount' per 'accountId' and return the maximum 'transactionAmount' per 'accountId'

To capture data for the last five days (excluding the current day), I'm using a normal for-loop

for i in range(day-window_size, day): 

I suspect I would need to use some sort of a comprehension to group and calculate the right values. The output should contain one line per day per account id and each line should contain each of the calculated statistics, for example: Output examle. I've managed to get everything but the average and max.

For example, I've calculated total value per category AA (same for other categories) with this code:

trans_AA = []
trans_AA.append([(x.accountId, x.transactionAmount) for x in sorted_trans  if x.category == "AA"])

Generic method to calculate total transactionAmount per category

def totals_per_cat(transactions):
    
    for accountId, transactionAmount in chain.from_iterable(transactions):
                 d[accountId] += transactionAmount
    trans_res = list(d.values())
    return trans_res

How to calculate average and max per accountId for the past 5 days?


Solution

  • This will hopefully get you most of the way there:

    from collections import defaultdict
    from dataclasses import dataclass, fields
    from csv import DictReader
    from statistics import mean
    from typing import Dict, List
    
    
    
    @dataclass
    class Transaction:
        transactionId: str
        accountId: str
        transactionDay: int
        category: str
        transactionAmount: float 
    
    
    def load_transactions(file_path: str) -> List[Transaction]:
        with open(file_path) as csv_file:
            return [
                Transaction(**{
                    field.name: field.type(row[field.name])
                    for field in fields(Transaction)
                }) for row in DictReader(csv_file)
            ]
    
    transactions = load_transactions('transactions.txt')
    min_day = min(xact.transactionDay for xact in transactions)
    max_day = max(xact.transactionDay for xact in transactions)
    window_size = 5
    
    xact_by_acct: Dict[str, List[Transaction]] = defaultdict(list)
    for xact in transactions:
        xact_by_acct[xact.accountId].append(xact)
    
    for day in range(min_day + window_size, max_day + 1):
        for acct, xacts in xact_by_acct.items():
            amts = [
                xact.transactionAmount for xact in xacts 
                if xact.transactionDay in range(day - window_size, day)
            ]  # all amounts within the window
            try:
                print(
                    f"Day: {day}\tAccount: {acct}\t"
                    f"Max: {max(amts)}\tAverage: {mean(amts)}"
                )
            except ValueError:
                # No transactions in the window for this account.
                pass