Is there an efficient way to generate N random integers in a range that have a given sum or average?
Here's my solution in Java. It is fully functional and contains two generators: PermutationPartitionGenerator
for unsorted partitions and CombinationPartitionGenerator
for sorted partitions. Your generator also implemented in the class SmithTromblePartitionGenerator
for comparison. The class SequentialEnumerator
enumerates all possible partitions (unsorted or sorted, depending on the parameter) in sequential order. I have added thorough tests (including your test cases) for all of these generators.
The implementation is self-explainable for the most part. If you have any questions, I will answer them in couple of days.
import java.util.Random;
import java.util.function.Supplier;
public abstract class PartitionGenerator implements Supplier<int[]>{
public static final Random rand = new Random();
protected final int numberCount;
protected final int min;
protected final int range;
protected final int sum; // shifted sum
protected final boolean sorted;
protected PartitionGenerator(int numberCount, int min, int max, int sum, boolean sorted) {
if (numberCount <= 0)
throw new IllegalArgumentException("Number count should be positive");
this.numberCount = numberCount;
this.min = min;
range = max - min;
if (range < 0)
throw new IllegalArgumentException("min > max");
sum -= numberCount * min;
if (sum < 0)
throw new IllegalArgumentException("Sum is too small");
if (numberCount * range < sum)
throw new IllegalArgumentException("Sum is too large");
this.sum = sum;
this.sorted = sorted;
}
// Whether this generator returns sorted arrays (i.e. combinations)
public final boolean isSorted() {
return sorted;
}
public interface GeneratorFactory {
PartitionGenerator create(int numberCount, int min, int max, int sum);
}
}
import java.math.BigInteger;
// Permutations with repetition (i.e. unsorted vectors) with given sum
public class PermutationPartitionGenerator extends PartitionGenerator {
private final double[][] distributionTable;
public PermutationPartitionGenerator(int numberCount, int min, int max, int sum) {
super(numberCount, min, max, sum, false);
distributionTable = calculateSolutionCountTable();
}
private double[][] calculateSolutionCountTable() {
double[][] table = new double[numberCount + 1][sum + 1];
BigInteger[] a = new BigInteger[sum + 1];
BigInteger[] b = new BigInteger[sum + 1];
for (int i = 1; i <= sum; i++)
a[i] = BigInteger.ZERO;
a[0] = BigInteger.ONE;
table[0][0] = 1.0;
for (int n = 1; n <= numberCount; n++) {
double[] t = table[n];
for (int s = 0; s <= sum; s++) {
BigInteger z = BigInteger.ZERO;
for (int i = Math.max(0, s - range); i <= s; i++)
z = z.add(a[i]);
b[s] = z;
t[s] = z.doubleValue();
}
// swap a and b
BigInteger[] c = b;
b = a;
a = c;
}
return table;
}
@Override
public int[] get() {
int[] p = new int[numberCount];
int s = sum; // current sum
for (int i = numberCount - 1; i >= 0; i--) {
double t = rand.nextDouble() * distributionTable[i + 1][s];
double[] tableRow = distributionTable[i];
int oldSum = s;
// lowerBound is introduced only for safety, it shouldn't be crossed
int lowerBound = s - range;
if (lowerBound < 0)
lowerBound = 0;
s++;
do
t -= tableRow[--s];
// s can be equal to lowerBound here with t > 0 only due to imprecise subtraction
while (t > 0 && s > lowerBound);
p[i] = min + (oldSum - s);
}
assert s == 0;
return p;
}
public static final GeneratorFactory factory = (numberCount, min, max,sum) ->
new PermutationPartitionGenerator(numberCount, min, max, sum);
}
import java.math.BigInteger;
// Combinations with repetition (i.e. sorted vectors) with given sum
public class CombinationPartitionGenerator extends PartitionGenerator {
private final double[][][] distributionTable;
public CombinationPartitionGenerator(int numberCount, int min, int max, int sum) {
super(numberCount, min, max, sum, true);
distributionTable = calculateSolutionCountTable();
}
private double[][][] calculateSolutionCountTable() {
double[][][] table = new double[numberCount + 1][range + 1][sum + 1];
BigInteger[][] a = new BigInteger[range + 1][sum + 1];
BigInteger[][] b = new BigInteger[range + 1][sum + 1];
double[][] t = table[0];
for (int m = 0; m <= range; m++) {
a[m][0] = BigInteger.ONE;
t[m][0] = 1.0;
for (int s = 1; s <= sum; s++) {
a[m][s] = BigInteger.ZERO;
t[m][s] = 0.0;
}
}
for (int n = 1; n <= numberCount; n++) {
t = table[n];
for (int m = 0; m <= range; m++)
for (int s = 0; s <= sum; s++) {
BigInteger z;
if (m == 0)
z = a[0][s];
else {
z = b[m - 1][s];
if (m <= s)
z = z.add(a[m][s - m]);
}
b[m][s] = z;
t[m][s] = z.doubleValue();
}
// swap a and b
BigInteger[][] c = b;
b = a;
a = c;
}
return table;
}
@Override
public int[] get() {
int[] p = new int[numberCount];
int m = range; // current max
int s = sum; // current sum
for (int i = numberCount - 1; i >= 0; i--) {
double t = rand.nextDouble() * distributionTable[i + 1][m][s];
double[][] tableCut = distributionTable[i];
if (s < m)
m = s;
s -= m;
while (true) {
t -= tableCut[m][s];
// m can be 0 here with t > 0 only due to imprecise subtraction
if (t <= 0 || m == 0)
break;
m--;
s++;
}
p[i] = min + m;
}
assert s == 0;
return p;
}
public static final GeneratorFactory factory = (numberCount, min, max, sum) ->
new CombinationPartitionGenerator(numberCount, min, max, sum);
}
import java.util.*;
public class SmithTromblePartitionGenerator extends PartitionGenerator {
public SmithTromblePartitionGenerator(int numberCount, int min, int max, int sum) {
super(numberCount, min, max, sum, false);
}
@Override
public int[] get() {
List<Integer> ls = new ArrayList<>(numberCount + 1);
int[] ret = new int[numberCount];
int increasedSum = sum + numberCount;
while (true) {
ls.add(0);
while (ls.size() < numberCount) {
int c = 1 + rand.nextInt(increasedSum - 1);
if (!ls.contains(c))
ls.add(c);
}
Collections.sort(ls);
ls.add(increasedSum);
boolean good = true;
for (int i = 0; i < numberCount; i++) {
int x = ls.get(i + 1) - ls.get(i) - 1;
if (x > range) {
good = false;
break;
}
ret[i] = x;
}
if (good) {
for (int i = 0; i < numberCount; i++)
ret[i] += min;
return ret;
}
ls.clear();
}
}
public static final GeneratorFactory factory = (numberCount, min, max, sum) ->
new SmithTromblePartitionGenerator(numberCount, min, max, sum);
}
import java.util.Arrays;
// Enumerates all partitions with given parameters
public class SequentialEnumerator extends PartitionGenerator {
private final int max;
private final int[] p;
private boolean finished;
public SequentialEnumerator(int numberCount, int min, int max, int sum, boolean sorted) {
super(numberCount, min, max, sum, sorted);
this.max = max;
p = new int[numberCount];
startOver();
}
private void startOver() {
finished = false;
int unshiftedSum = sum + numberCount * min;
fillMinimal(0, Math.max(min, unshiftedSum - (numberCount - 1) * max), unshiftedSum);
}
private void fillMinimal(int beginIndex, int minValue, int fillSum) {
int fillRange = max - minValue;
if (fillRange == 0)
Arrays.fill(p, beginIndex, numberCount, max);
else {
int fillCount = numberCount - beginIndex;
fillSum -= fillCount * minValue;
int maxCount = fillSum / fillRange;
int maxStartIndex = numberCount - maxCount;
Arrays.fill(p, maxStartIndex, numberCount, max);
fillSum -= maxCount * fillRange;
Arrays.fill(p, beginIndex, maxStartIndex, minValue);
if (fillSum != 0)
p[maxStartIndex - 1] = minValue + fillSum;
}
}
@Override
public int[] get() { // returns null when there is no more partition, then starts over
if (finished) {
startOver();
return null;
}
int[] pCopy = p.clone();
if (numberCount > 1) {
int i = numberCount;
int s = p[--i];
while (i > 0) {
int x = p[--i];
if (x == max) {
s += x;
continue;
}
x++;
s--;
int minRest = sorted ? x : min;
if (s < minRest * (numberCount - i - 1)) {
s += x;
continue;
}
p[i++]++;
fillMinimal(i, minRest, s);
return pCopy;
}
}
finished = true;
return pCopy;
}
public static final GeneratorFactory permutationFactory = (numberCount, min, max, sum) ->
new SequentialEnumerator(numberCount, min, max, sum, false);
public static final GeneratorFactory combinationFactory = (numberCount, min, max, sum) ->
new SequentialEnumerator(numberCount, min, max, sum, true);
}
import java.util.*;
import java.util.function.BiConsumer;
import PartitionGenerator.GeneratorFactory;
public class Test {
private final int numberCount;
private final int min;
private final int max;
private final int sum;
private final int repeatCount;
private final BiConsumer<PartitionGenerator, Test> procedure;
public Test(int numberCount, int min, int max, int sum, int repeatCount,
BiConsumer<PartitionGenerator, Test> procedure) {
this.numberCount = numberCount;
this.min = min;
this.max = max;
this.sum = sum;
this.repeatCount = repeatCount;
this.procedure = procedure;
}
@Override
public String toString() {
return String.format("=== %d numbers from [%d, %d] with sum %d, %d iterations ===",
numberCount, min, max, sum, repeatCount);
}
private static class GeneratedVector {
final int[] v;
GeneratedVector(int[] vect) {
v = vect;
}
@Override
public int hashCode() {
return Arrays.hashCode(v);
}
@Override
public boolean equals(Object obj) {
if (this == obj)
return true;
return Arrays.equals(v, ((GeneratedVector)obj).v);
}
@Override
public String toString() {
return Arrays.toString(v);
}
}
private static final Comparator<Map.Entry<GeneratedVector, Integer>> lexicographical = (e1, e2) -> {
int[] v1 = e1.getKey().v;
int[] v2 = e2.getKey().v;
int len = v1.length;
int d = len - v2.length;
if (d != 0)
return d;
for (int i = 0; i < len; i++) {
d = v1[i] - v2[i];
if (d != 0)
return d;
}
return 0;
};
private static final Comparator<Map.Entry<GeneratedVector, Integer>> byCount =
Comparator.<Map.Entry<GeneratedVector, Integer>>comparingInt(Map.Entry::getValue)
.thenComparing(lexicographical);
public static int SHOW_MISSING_LIMIT = 10;
private static void checkMissingPartitions(Map<GeneratedVector, Integer> map, PartitionGenerator reference) {
int missingCount = 0;
while (true) {
int[] v = reference.get();
if (v == null)
break;
GeneratedVector gv = new GeneratedVector(v);
if (!map.containsKey(gv)) {
if (missingCount == 0)
System.out.println(" Missing:");
if (++missingCount > SHOW_MISSING_LIMIT) {
System.out.println(" . . .");
break;
}
System.out.println(gv);
}
}
}
public static final BiConsumer<PartitionGenerator, Test> distributionTest(boolean sortByCount) {
return (PartitionGenerator gen, Test test) -> {
System.out.print("\n" + getName(gen) + "\n\n");
Map<GeneratedVector, Integer> combos = new HashMap<>();
// There's no point of checking permus for sorted generators
// because they are the same as combos for them
Map<GeneratedVector, Integer> permus = gen.isSorted() ? null : new HashMap<>();
for (int i = 0; i < test.repeatCount; i++) {
int[] v = gen.get();
if (v == null && gen instanceof SequentialEnumerator)
break;
if (permus != null) {
permus.merge(new GeneratedVector(v), 1, Integer::sum);
v = v.clone();
Arrays.sort(v);
}
combos.merge(new GeneratedVector(v), 1, Integer::sum);
}
Set<Map.Entry<GeneratedVector, Integer>> sortedEntries = new TreeSet<>(
sortByCount ? byCount : lexicographical);
System.out.println("Combos" + (gen.isSorted() ? ":" : " (don't have to be uniform):"));
sortedEntries.addAll(combos.entrySet());
for (Map.Entry<GeneratedVector, Integer> e : sortedEntries)
System.out.println(e);
checkMissingPartitions(combos, test.getGenerator(SequentialEnumerator.combinationFactory));
if (permus != null) {
System.out.println("\nPermus:");
sortedEntries.clear();
sortedEntries.addAll(permus.entrySet());
for (Map.Entry<GeneratedVector, Integer> e : sortedEntries)
System.out.println(e);
checkMissingPartitions(permus, test.getGenerator(SequentialEnumerator.permutationFactory));
}
};
}
public static final BiConsumer<PartitionGenerator, Test> correctnessTest =
(PartitionGenerator gen, Test test) -> {
String genName = getName(gen);
for (int i = 0; i < test.repeatCount; i++) {
int[] v = gen.get();
if (v == null && gen instanceof SequentialEnumerator)
v = gen.get();
if (v.length != test.numberCount)
throw new RuntimeException(genName + ": array of wrong length");
int s = 0;
if (gen.isSorted()) {
if (v[0] < test.min || v[v.length - 1] > test.max)
throw new RuntimeException(genName + ": generated number is out of range");
int prev = test.min;
for (int x : v) {
if (x < prev)
throw new RuntimeException(genName + ": unsorted array");
s += x;
prev = x;
}
} else
for (int x : v) {
if (x < test.min || x > test.max)
throw new RuntimeException(genName + ": generated number is out of range");
s += x;
}
if (s != test.sum)
throw new RuntimeException(genName + ": wrong sum");
}
System.out.format("%30s : correctness test passed%n", genName);
};
public static final BiConsumer<PartitionGenerator, Test> performanceTest =
(PartitionGenerator gen, Test test) -> {
long time = System.nanoTime();
for (int i = 0; i < test.repeatCount; i++)
gen.get();
time = System.nanoTime() - time;
System.out.format("%30s : %8.3f s %10.0f ns/test%n", getName(gen), time * 1e-9, time * 1.0 / test.repeatCount);
};
public PartitionGenerator getGenerator(GeneratorFactory factory) {
return factory.create(numberCount, min, max, sum);
}
public static String getName(PartitionGenerator gen) {
String name = gen.getClass().getSimpleName();
if (gen instanceof SequentialEnumerator)
return (gen.isSorted() ? "Sorted " : "Unsorted ") + name;
else
return name;
}
public static GeneratorFactory[] factories = { SmithTromblePartitionGenerator.factory,
PermutationPartitionGenerator.factory, CombinationPartitionGenerator.factory,
SequentialEnumerator.permutationFactory, SequentialEnumerator.combinationFactory };
public static void main(String[] args) {
Test[] tests = {
new Test(3, 0, 3, 5, 3_000, distributionTest(false)),
new Test(3, 0, 6, 12, 3_000, distributionTest(true)),
new Test(50, -10, 20, 70, 2_000, correctnessTest),
new Test(7, 3, 10, 42, 1_000_000, performanceTest),
new Test(20, 3, 10, 120, 100_000, performanceTest)
};
for (Test t : tests) {
System.out.println(t);
for (GeneratorFactory factory : factories) {
PartitionGenerator candidate = t.getGenerator(factory);
t.procedure.accept(candidate, t);
}
System.out.println();
}
}
}
You can try this on Ideone.
Here is the algorithm from John McClane's PermutationPartitionGenerator, in another answer on this page. It has two phases, namely a setup phase and a sampling phase, and generates n
random numbers in [min
, max
] with the sum sum
, where the numbers are listed in random order.
Setup phase: First, a solution table is built using the following formulas (t(y, x)
where y
is in [0, n
] and x
is in [0, sum - n * min
]):
- t(0, j) = 1 if j == 0; 0 otherwise
- t(i, j) = t(i-1, j) + t(i-1, j-1) + ... + t(i-1, j-(max-min))
Here, t(y, x) stores the relative probability that the sum of y
numbers (in the appropriate range) will equal x
. This probability is relative to all t(y, x) with the same y
.
Sampling phase: Here we generate a sample of n
numbers. Set s
to sum - n * min
, then for each position i
, starting with n - 1
and working backwards to 0:
- Set
v
to a random integer in [0, t(i+1, s)). - Set
r
tomin
. - Subtract t(i, s) from
v
. - While
v
remains 0 or greater, subtract t(i, s-1) fromv
, add 1 tor
, and subtract 1 froms
. - The number at position
i
in the sample is set tor
.
EDIT:
It appears that with trivial changes to the algorithm above, it's possible to have each random number use a separate range rather than use the same range for all of them:
Each random number at positions i
∈ [0, n
) has a minimum value min(i) and a maximum value max(i).
Let adjsum
= sum
- Σmin(i).
Setup phase: First, a solution table is built using the following formulas (t(y, x)
where y
is in [0, n
] and x
is in [0, adjsum
]):
- t(0, j) = 1 if j == 0; 0 otherwise
- t(i, j) = t(i-1, j) + t(i-1, j-1) + ... + t(i-1, j-(max(i-1)-min(i-1)))
The sampling phase is then exactly the same as before, except we set s
to adjsum
(rather than sum - n * min
) and set r
to min(i) (rather than min
).
EDIT:
For John McClane's CombinationPartitionGenerator, the setup and sampling phases are as follows.
Setup phase: First, a solution table is built using the following formulas (t(z, y, x)
where z
is in [0, n
], y
is in [0, max - min
], and x
is in [0, sum - n * min
]):
- t(0, j, k) = 1 if k == 0; 0 otherwise
- t(i, 0, k) = t(i - 1, 0, k)
- t(i, j, k) = t(i, j-1, k) + t(i - 1, j, k - j)
Sampling phase: Here we generate a sample of n
numbers. Set s
to sum - n * min
and mrange
to max - min
, then for each position i
, starting with n - 1
and working backwards to 0:
- Set
v
to a random integer in [0, t(i+1, mrange, s)). - Set
mrange
to min(mrange
,s
) - Subtract
mrange
froms
. - Set
r
tomin + mrange
. - Subtract t(
i
,mrange
,s
) fromv
. - While
v
remains 0 or greater, add 1 tos
, subtract 1 fromr
and 1 frommrange
, then subtract t(i
,mrange
,s
) fromv
. - The number at position
i
in the sample is set tor
.
I have not tested this, so it is not really an answer, just something to try which is too long to fit into a comment. Start with an array which meets the first two criteria and play with it so it still meets the first two, but is a lot more random.
If the mean is an integer, then your initial array can be [4, 4, 4, ... 4] or maybe [3, 4, 5, 3, 4, 5, ... 5, 8, 0] or something simple like that. For a mean of 4.5, try [4, 5, 4, 5, ... 4, 5].
Next pick a pair of numbers, num1
and num2
, in the array. Probably the first number should be taken in order, as with the Fisher-Yates shuffle, the second number should be picked at random. Taking the first number in order ensures that every number is picked at least once.
Now calculate max-num1
and num2-min
. Those are the distances from the two numbers to the max
and min
boundaries. Set limit
to the smaller of the two distances. That is the maximum change allowed which will not put one or other of the numbers outside the allowed limits. If limit
is zero then skip this pair.
Pick a random integer in the range [1, limit
]: call it change
. I omit 0 from the pickable range as it has no effect. Testing may show that you get better randomness by including it; I'm not sure.
Now set num1 <- num1 + change
and num2 <- num2 - change
. That will not affect the mean value and all elements of the array are still within the required boundaries.
You will need to run through the whole array at least once. Testing should show if you need to run through it more than once to get something sufficiently random.
ETA: include pseudocode
// Set up the array.
resultAry <- new array size N
for (i <- 0 to N-1)
// More complex initial setup schemes are possible here.
resultAry[i] <- mean
rof
// Munge the array entries.
for (ix1 <- 0 to N-1) // ix1 steps through the array in order.
// Pick second entry different from first.
repeat
ix2 <- random(0, N-1)
until (ix2 != ix1)
// Calculate size of allowed change.
hiLimit <- max - resultAry[ix1]
loLimit <- resultAry[ix2] - min
limit <- minimum(hiLimit, loLimit)
if (limit == 0)
// No change possible so skip.
continue loop with next ix1
fi
// Change the two entries keeping same mean.
change <- random(1, limit) // Or (0, limit) possibly.
resultAry[ix1] <- resultAry[ix1] + change
resultAry[ix2] <- resultAry[ix2] - change
rof
// Check array has been sufficiently munged.
if (resultAry not random enough)
munge the array again
fi