Search code examples
algorithmlanguage-agnostic

Minimize transaction costs for settling debts in a pool


Suppose a group of friends shared some expenses across a month, and need to settle debts at the end. Each friend has a certain amount of money they should give/receive (the sum of debts and receivable amounts is zero), but everything must be settled with direct transfers (no central pool, only money from one person to another), and each transfer has a cost.

For example, suppose 3 people, A, B and C.

  • Person A must pay $100
  • Person B must receive $50
  • Person C must receive $50

The cost of each transaction can be described by the following matrix(the person in the row paying to the person in the column).

enter image description here

Given these costs, the optimal solution would be

  • Person A transfers $100 to Person B
  • Person B transfers $50 to Person C

This settles all debts with a transaction cost of 2. How can I generalize this?

All I could find searching was for simplifying chain debts (person A owes person B that owes person C, so person A owes person C).

The closest I found is this, but it doesn't have costs for transactions.

Backstory(if anyone is interested):

We live in a house with 8 people, and each month we pay bills from our own money and annotate it in a spreadsheet so at the end of the month we share the expenses fairly. However we have accounts on different banks, and some of these require fees for transferring money to other banks, so we prefer to keep transactions from the same bank.


Solution

  • After @j_random_hacker explained this problem is np-hard in the comments, I lost all hope of making a pretty solution and set out to do one that works.

    Following @user3386109's suggestion, also in the comments, I based myself on minpaths to solve the problem. So I began by using the [Floyd-Warshall algorithm] to find the minimum cost to get money from a person to another for every pair. This yields a matrix of the minimum cost for transferring money from the person in the row to the person in the column. I also applied the modification(it is available in the Wikipedia article in the Path Reconstruction section) to obtain a matrix that yields the next node(next hop, the actual person you must send your money if want to reach the person in the column with minimal cost) for the person in the row to transfer money to the person in the column.

    Example of the initialized matrixes, and the result after running the algorithm(The important elements that changed have a red dot):
    Example of matrixes before and after running the Floyd-Warshall algorithm

    I then decided to make simple Branch-and-Bound recursion. Every recursion call would receive a list of debts, a matrix for all transactions so far and the cost to reach this state. It must return TODO. We keep a global for the best solution found, and at the beginning of the recursion call check if the cost to reach this state is already worst than the global best, if it is, we return an "infinite" cost to signal this is not a solution we need to consider.

    Then we select someone that owes money in the debt list and for each person that must receive money, we create copies of the debt list and transaction matrix, and simulate the transaction of the maximum amount of money between these two people(If A must pay 100 but C only needs to receive 50, the maximum is 50). The transactions matrix is modified by increasing all transaction in the minpath for these two people by the amount transferred. The cost is incremented if we increased an element that was previously zero. We then call the recursion. If the debt list reaches 0, a solution was found, the global minimum cost is updated and the cost returned is 0.

    In a previous version, I spawned a recursion for each pair of owing/receiving person. This yielded horrible performance, and proved unnecessary, since the order of transactions doesn't matter, and any debts that weren't yet settled we be treated at lower levels of recursion.

    This algorithm seemed correct, but there was still a problem!

    In the following example:
    matrix of transaction weights for a failed example

    • Person A must pay 40$
    • Person B must pay 40$
    • Person C must pay 20$
    • Person D must receive 100$

    The algorithm as it is now, makes A, B, C each make a transfer to D. The actual best way would be to choose one of A, B or C to transfer all the money to D in only one payment.

    In this example, person A,B and C all have the same extremely high cost to transfer to D, and the next hop for all of them to get money to D is to go directly to D. However, the optimal solution would be to make everyone transfer all their money to a single person n transfer it to D all in one go. The algorithm is failing to recognize that since someone already made a transfer to person D, the cost of this transaction is 0, and so we should use this new path. To address this issue, I included a matrix for costs and a matrix for paths in the recursion, at the beginning of the recursion we set all costs of transfers that have already been made in this recursion branch to 0, and run Floyd-Warshall algorithm again. The recursion uses these matrixes instead of the global ones and passes them on. Sure, it means multiplying complexity by V^3, but it's the only way I found to solve this issue.

    The algorithm seems to be working now, but I will probably keep trying to improve it, especially in code readability. The complete code is available at: My gitlab project, inside the calculations folder

    Sorry for the long answer and delay in posting it, but I found important to document my work thoroughly.