Cluster numbers into n partitions so that each partitions sum is closest to total/n
For lists of the size in the example, brute-forcing s/b fine:
Module[{d = #1, m = #2, l = Length@#1, dt = (Tr@#1)/#2, dtp, parted},
parted =
Internal`PartitionRagged[d, #] & /@
Join @@ Permutations /@ IntegerPartitions[l, {m}];
dtp = Total[Abs[Total[parted, {3}] - dt], {2}];
Pick[parted, dtp, Min@dtp]] &[data, 4]
(*
{{{16, 4, 17, 10}, {15, 4, 4, 6, 7, 14, 9}, {17, 27, 6, 1, 9}, {0, 12, 20, 8, 0, 3, 4, 0, 3, 4}},
{{16, 4, 17, 10}, {15, 4, 4, 6, 7, 14, 9}, {17, 27, 6, 1, 9, 0}, {12, 20, 8, 0, 3, 4, 0, 3, 4}}}
*)
I chose minimization of sum of absolute differences from goal as the metric, you can change it (dtp=...
) to whatever floats your boat.
Here is a simple approach that splits the data
into 4 sequences, with each sequence being as close to 55
as possible
drop[list_, m_] :=
Drop[list, (Position[#, Sequence @@ Nearest[#, m]] &@ Accumulate@list)[[1, 1]]]
take[list_, m_] :=
Take[list, (Position[#, Sequence @@ Nearest[#, m]] &@ Accumulate@list)[[1, 1]]]
NestList[{take[#[[2]], 55], drop[#[[2]], 55]} &, {{}, data}, 4][[2 ;;, 1]]
{{16, 4, 17, 10, 15}, {4, 4, 6, 7, 14, 9, 17}, {27, 6, 1, 9, 0, 12}, {20, 8, 0, 3, 4, 0, 3, 4}}
Total /@ %
{62, 61, 55, 42}
One could use
pos[list_, m_] := (Position[#, Sequence @@ Nearest[#, m]] &@ Accumulate@list)[[1, 1]]
NestList[{pos[#[[2]], 55], drop[#[[2]], 55]} &, {{}, data}, 4][[2 ;;, 1]]
{5, 7, 6, 8}
to get some starting values, that could be used for some code that splits the data
into sequences using a global measure for the difference to 55
.
Here's an overengineered approach using the method of simulated annealing. I apologize for the poor style of coding, this is something I had lying around found somewhere online and modified just now for this task.
data = {16, 4, 17, 10, 15, 4, 4, 6, 7, 14, 9, 17, 27, 6, 1, 9, 0, 12, 20, 8, 0, 3, 4, 0, 3, 4};
Solution = Partition[Range@26, 7, 7, 1, {}];
Optimum = Solution;
Fitness[list_] := Norm@(Total /@ (data[[#]] & /@ list) - 55);
Iterate[Sol_, temp_] :=
Module[{nSol = Sol, fromto = RandomSample[Range@4, 2], index, val},
While[Length@nSol[[First@fromto]] == 0,
fromto = RandomSample[Range@4, 2]];
index = RandomInteger[{1, Length@nSol[[First@fromto]]}];
val = nSol[[First@fromto, index]];
nSol = Delete[nSol, {First@fromto, index}];
nSol = Insert[nSol, val, {Last@fromto, 1}];
If[Fitness[nSol] < Fitness[Optimum], Optimum = nSol];
Solution =
RandomChoice[{Exp[-Fitness[nSol]/temp],
Exp[-Fitness[Sol]/temp]} -> {nSol, Sol}]]
ListPlot@(tempSched = (.9 + Tanh[(10000 - #)/15000]) & /@
Range[30000])
ListPlot@(Fitness /@ FoldList[Iterate, Solution, tempSched])
The current solution is a partition of the list of indices. With each step the program randomly takes a number from one partition and sticks it into another. It takes then a random choice between the newly proposed solution and the old one based on the current "temperature" and the fitness of either solution (how close they are to summing to 55). Also, if it finds a better solution than any previous one it saves it for future use.
After running the above, let's check the results.
Solution
data[[#]] & /@ Solution
Total /@ %
{{21, 24, 17, 7, 2, 16, 4, 18, 5}, {14, 9, 12, 13}, {15, 22, 6, 26, 8, 23, 11, 20, 1}, {25, 10, 3, 19}} {{0, 0, 0, 4, 4, 9, 10, 12, 15}, {6, 7, 17, 27}, {1, 3, 4, 4, 6, 4, 9, 8, 16}, {3, 14, 17, 20}} {54, 57, 55, 54}
Quite good, but we can do better.
Optimum
data[[#]] & /@ Optimum
Total /@ %
{{17, 8, 14, 9, 11, 18, 5}, {25, 20, 12, 13}, {15, 21, 4, 26, 6, 24, 7, 22, 23, 16, 1}, {2, 10, 3, 19}} {{0, 6, 6, 7, 9, 12, 15}, {3, 8, 17, 27}, {1, 0, 10, 4, 4, 0, 4, 3, 4, 9, 16}, {4, 14, 17, 20}} {55, 55, 55, 55}
Exactly as desired. Finally OP requests to maintain the order of the data, so
data[[#]] & /@ SortBy[Sort /@ Optimum, First]
{{16, 10, 4, 4, 1, 9, 0, 3, 4, 0, 4}, {4, 17, 14, 20}, {15, 6, 7, 9, 6, 0, 12}, {17, 27, 8, 3}}