Efficient computation of n choose k in Node.js

The following algorithm has a run-time complexity of O(1) given a linear look-up table of log-factorials with space-complexity O(n).

Limiting n and k to the range [0, 1000] makes sense since binomial(1000, 500) is already dangerously close to Number.MAX_VALUE. We would thus need a look-up table of size 1000.

On a modern JavaScript engine, a compact array of n numbers has a size of n * 8 bytes. A full look-up table would thus require 8 kilobytes of memory. If we limit our input to the range [0, 100], the table would only occupy 800 bytes.

var logf = [0, 0, 0.6931471805599453, 1.791759469228055, 3.1780538303479458, 4.787491742782046, 6.579251212010101, 8.525161361065415, 10.60460290274525, 12.801827480081469, 15.104412573075516, 17.502307845873887, 19.987214495661885, 22.552163853123425, 25.19122118273868, 27.89927138384089, 30.671860106080672, 33.50507345013689, 36.39544520803305, 39.339884187199495, 42.335616460753485, 45.38013889847691, 48.47118135183523, 51.60667556776438, 54.78472939811232, 58.00360522298052, 61.261701761002, 64.55753862700634, 67.88974313718154, 71.25703896716801, 74.65823634883016, 78.0922235533153, 81.55795945611504, 85.05446701758152, 88.58082754219768, 92.1361756036871, 95.7196945421432, 99.33061245478743, 102.96819861451381, 106.63176026064346, 110.32063971475739, 114.0342117814617, 117.77188139974507, 121.53308151543864, 125.3172711493569, 129.12393363912722, 132.95257503561632, 136.80272263732635, 140.67392364823425, 144.5657439463449, 148.47776695177302, 152.40959258449735, 156.3608363030788, 160.3311282166309, 164.32011226319517, 168.32744544842765,  172.3527971391628, 176.39584840699735, 180.45629141754378, 184.53382886144948, 188.6281734236716, 192.7390472878449, 196.86618167289, 201.00931639928152, 205.1681994826412, 209.34258675253685, 213.53224149456327, 217.73693411395422, 221.95644181913033, 226.1905483237276, 230.43904356577696, 234.70172344281826, 238.97838956183432, 243.2688490029827, 247.57291409618688, 251.8904022097232, 256.22113555000954, 260.5649409718632, 264.9216497985528, 269.2910976510198, 273.6731242856937, 278.0675734403661, 282.4742926876304, 286.893133295427, 291.3239500942703, 295.76660135076065, 300.22094864701415, 304.6868567656687, 309.1641935801469, 313.65282994987905, 318.1526396202093, 322.66349912672615, 327.1852877037752, 331.7178871969285, 336.26118197919845, 340.815058870799, 345.37940706226686, 349.95411804077025, 354.5390855194408, 359.1342053695754, 363.73937555556347];

function binomial(n, k) {
    return Math.exp(logf[n] - logf[n-k] - logf[k]);
}

console.log(binomial(5, 3));

Explanation

Starting with the original iterative algorithm, we first replace the product with a sum of logarithms:

function binomial(n, k) {
    var logresult = 0;
    for (var i = 1; i <= k; i++) {
        logresult += Math.log(n + 1 - i) - Math.log(i);
    }
    return Math.exp(logresult);
}

Our loop now sums over k terms. If we rearrange the sum, we can easily see that we sum over consecutive logarithms log(1) + log(2) + ... + log(k) etc. which we can replace by a sum_of_logs(k) which is actually identical to log(k!). Precomputing these values and storing them in our lookup-table logf then leads to the above one-liner algorithm.

Computing the look-up table:

I recommend precomputing the lookup-table with higher precision and converting the resulting elements to 64-bit floats. If you do not need that little bit of additional precision or want to run this code on the client side, use this:

var size = 1000, logf = new Array(size);
logf[0] = 0;
for (var i = 1; i <= size; ++i) logf[i] = logf[i-1] + Math.log(i);

Numerical precision:

By using log-factorials, we avoid precision problems inherent to storing raw factorials.

We could even use Stirling's approximation for log(n!) instead of a lookup-table and still get 12 significant figures for above computation in both run-time and space complexity O(1):

function logf(n) {
    return n === 0 ? 0 : (n + .5) * Math.log(n) - n + 0.9189385332046728 + 0.08333333333333333 / n - 0.002777777777777778 * Math.pow(n, -3);
}

function binomial(n , k) {
    return Math.exp(logf(n) - logf(n - k) - logf(k));
}

console.log(binomial(1000, 500)); // 2.7028824094539536e+299

Never compute factorials, they grow too quickly. Instead compute the result you want. In this case, you want the binomial numbers, which have an incredibly simple geometric construction: you can build pascal's triangle, as you need it, and do it using plain arithmetic.

Start with [1] and [1,1]. The next row has [1] at the start, [1+1] in the middle, and [1] at the end: [1,2,1]. Next row: [1] at the start, the sum of the first two terms in spot 2, the sum of the next two terms in spot three, and [1] at the end: [1,3,3,1]. Next row: [1], then 1+3=4, then 3+3=6, then 3+1=4, then [1] at the end, and so on and so on. As you can see, no factorials, logarithms, or even multiplications: just super fast addition with clean integer numbers. So simple, you can build a massive lookup table by hand.

And you should.

Never compute in code what you can compute by hand and just include as constants for immediate lookup; in this case, writing out the table for up to something around n=20 is absolutely trivial, and you can then just use that as your "starting LUT" and probably never even access the high rows.

But, if you do need them, or more, then because you can't build an infinite lookup table you compromise: you start with a pre-specified LUT, and a function that can "fill it up" to some term you need that's not in it yet:

// step 1: a basic LUT with a few steps of Pascal's triangle
const binomials = [
  [1],
  [1,1],
  [1,2,1],
  [1,3,3,1],
  [1,4,6,4,1],
  [1,5,10,10,5,1],
  [1,6,15,20,15,6,1],
  [1,7,21,35,35,21,7,1],
  [1,8,28,56,70,56,28,8,1],
  //  ...
];

// step 2: a function that builds out the LUT if it needs to.
module.exports = function binomial(n,k) {
  while(n >= binomials.length) {
    let s = binomials.length;
    let nextRow = [];
    nextRow[0] = 1;
    for(let i=1, prev=s-1; i<s; i++) {
      nextRow[i] = binomials[prev][i-1] + binomials[prev][i];
    }
    nextRow[s] = 1;
    binomials.push(nextRow);
  }
  return binomials[n][k];
};

Since this is an array of ints, the memory footprint is tiny. For a lot of work involving binomials, we realistically don't even need more than two bytes per integer, making this a minuscule lookup table: we don't need more than 2 bytes until you need binomials higher than n=19, and the full lookup table up to n=19 takes up a measly 380 bytes. This is nothing compared to the rest of your program. Even if we allow for 32 bit ints, we can get up to n=35 in a mere 2380 bytes.

So the lookup is fast: either O(constant) for previously computed values, (n*(n+1))/2 steps if we have no LUT at all (in big O notation, that would be O(n²), but big O notation is almost never the right complexity measure), and somewhere in between for terms we need that aren't in the LUT yet. Run some benchmarks for your application, which will tell you how big your initial LUT should be, simply hard code that (seriously. these are constants, they're are exactly the kind of values that should be hard coded), and keep the generator around just in case.

However, do remember that you're in JavaScript land, and you are constrained by the JavaScript numerical type: integers only go up to 2^53, beyond that the integer property (every n has a distinct m=n+1 such that m-n=1) is not guaranteed. This should hardly ever be a problem, though: once we hit that limit, we're dealing with binomial coefficients that you should never even be using.

Efficient computation of n choose k in Node.js

Tags:

Javascript

Performance

Algorithm

Node.Js

Related

Recent Posts