Algorithm to simplify a weighted directed graph of debts

Here is an academic paper which investigates this problem in great detail. There is also some sample code for the different algorithms in Section 8 towards the end.

Verhoeff, T. (2004). Settling multiple debts efficiently : an invitation to computing science. Informatics in Education, 3(1), 105-126.


Simple algorithm

You can compute in linear time how much money each person expects to receive or pay in total. So you could simply create two lists, one for debit and the other for credit, and then balance the heads of the two lists against each other until both are empty. From your first example:

  • Initial state: Debit: (A: 25), Credit: (B: 15, C: 10)
  • First transaction, A:15 -> B: Debit: (A: 10), Credit: (C: 10)
  • Second transaction, A:10 -> C: Debit: (), Credit: ()

The transactions define the edges of your graph. For n persons involved, there will be at most n-1 transactions (edges). In the beginning, the total length of both lists is at most n. In each step, at least one of the lists (debit/credit) gets shorter by one, and in the last step both lists become empty at once.
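The two-list procedure above can be sketched roughly as follows. This is only an illustration, not code from the paper: the function name `settle`, the `(debtor, creditor, amount)` tuple format, and the sample debts are assumptions, and amounts are taken to be integers (e.g. cents) so that balances cancel exactly.

```python
from collections import defaultdict


def settle(debts):
    """Greedy settlement sketch.

    debts: iterable of (debtor, creditor, amount) with integer amounts
           (hypothetical input format, chosen for this example).
    Returns a list of (payer, payee, amount) transactions.
    """
    # Net balance per person: negative means the person owes money overall.
    balance = defaultdict(int)
    for debtor, creditor, amount in debts:
        balance[debtor] -= amount
        balance[creditor] += amount

    # Two lists: who still has to pay, and who still has to receive.
    debit = [[p, -b] for p, b in balance.items() if b < 0]
    credit = [[p, b] for p, b in balance.items() if b > 0]

    transactions = []
    i = j = 0
    while i < len(debit) and j < len(credit):
        # Balance the heads of the two lists against each other.
        pay = min(debit[i][1], credit[j][1])
        transactions.append((debit[i][0], credit[j][0], pay))
        debit[i][1] -= pay
        credit[j][1] -= pay
        # At least one of the two heads is now settled; advance past it.
        if debit[i][1] == 0:
            i += 1
        if credit[j][1] == 0:
            j += 1
    return transactions


# Hypothetical debts that yield Debit: (A: 25), Credit: (B: 15, C: 10).
print(settle([("A", "B", 20), ("B", "C", 5), ("A", "C", 5)]))
```

Each loop iteration retires at least one list entry, which is where the bound of n-1 transactions comes from.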

The issue is that, in general, this graph doesn't have to be similar to the original graph, which, as I understand your intention, is a requirement. (Is it? There are cases where the optimal solution consists of adding new edges. Imagine A owing B and B owing C the same amount of money: A should pay C directly, but this edge is not in the graph of debts.)

Fewer transactions

If the goal is just to construct an equivalent graph, you could search the creditor and debtor lists (as in the section above) for exact matches, or for cases where the sum of several credits matches the debit of one person (or the other way round). Look up bin packing for related techniques. For other cases you will have no choice but to split the flows, but even the simple algorithm above produces a graph with at most one fewer edge than there are persons involved.

EDIT: Thanks to j_random_hacker for pointing out that a solution with fewer than n-1 edges is possible iff there is a group of persons whose total debt matches the total credit of another group of persons. Then the problem can be split into two subproblems with a total cost of at most n-2 edges for the transaction graph. Unfortunately, finding such groups is an instance of the subset sum problem, which is NP-hard.
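For small inputs, such a group can still be found by brute force. The following is a minimal sketch under the assumption that net balances are already known and given as integers; the function name `find_zero_sum_group` and the dictionary format are made up for this example, and the running time is exponential in the number of persons.

```python
from itertools import combinations


def find_zero_sum_group(balances):
    """Brute-force search for a group that can be settled on its own.

    balances: dict mapping person -> integer net balance
              (negative = owes money); hypothetical input format.
    Returns a set of persons whose balances sum to zero and whose
    complement is also non-trivial, or None if no such group exists.
    """
    people = [p for p, b in balances.items() if b != 0]
    # A useful group needs at least 2 members, and so does its complement.
    # Checking sizes up to half is enough, since the complement of a
    # zero-sum group also sums to zero.
    for size in range(2, len(people) // 2 + 1):
        for group in combinations(people, size):
            if sum(balances[p] for p in group) == 0:
                return set(group)
    return None


# A and B can settle between themselves, independently of C and D.
print(find_zero_sum_group({"A": -10, "B": 10, "C": -5, "D": 5}))
```

If a group is found, the simple algorithm can be run on the group and on the remaining persons separately, and the search can be repeated on each part.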

A flow problem?

Perhaps this can also be transformed into a min-cost flow problem. If you just want to simplify your original graph, construct a flow on it, with the original debt amounts as edge capacities. The debtors serve as inflow nodes (fed through a connector node that has an edge to each debtor with capacity equal to that debtor's total debt), and the creditors serve as outflow nodes (with a similar connector node).

If you want to minimize the number of transactions, you will prefer keeping the "big" transactions and reducing the "small" ones. Hence, the cost of each edge could be modeled as the inverse of the flow on that edge.