3SUM problem (finding triplets) in better than O(n^2)

First of all. Finding all triplets in worst case is O(n^3). Suppose you have n=3k numbers. K of them are 3, k are 4 and k are 5.

3,....,3,4,....,4,5,.....5

There are k^3 = n^3/27 = O(n^3) such triplets. Just printing them takes O(n^3) time.

Next will be explaining of 3SUM problem in such form:

There are numbers s_1, ..., s_n each of them in range [-u;u]. How many triplets a,b,c there are that a+b=c?

  1. transforming. Get 2*u numbers a_-u, ..., a_0, a_1, ... a_u. a_i is amount of numbers s_j, that s_j = i. This is done in O(n+u)

  2. res = a_0 * sum(i=-u..u, i=/=0, C(a_i, 2)) + C(a_0,3) a_0 = 0

  3. Build a polynom P(x) = sum(i = -u...u, a_i*x^(i+u).

  4. Find Q(x) = P(x)*P(x) using FFT.

  5. Note that Q(x) = sum(i=-2u..2u, b_i*x^(i+2*u)), where b_i is number of pairs s_u,s_k that s_u+s_k = i (This includes using same number twice).

  6. For all even i do b_i = b_i - a_(i/2). This will remove using same number twice.

  7. Sum all b_i*a_i/2 - add this to res.

Example: to be more simple, I will assume that range for numbers is [0..u] and won't use any +u in powers of x. Suppose that we have numbers 1,2,3 - a_0 = 0, a_1 = 1, a_2 = 1, a_3 = 1

  • res = 0

  • P(x) = x + x^2 + x^3

  • Q(x) = x^2 +2x^3+3x^4+2x^5+x^6

  • After subtracting b_2 = 0, b_3 = 2, b_4 = 2, b_5 = 2, b_6 = 0

  • res += 0*1/2 + 2*1/2 + 2*0/2 + 2*0/2 + 6*0/2 = 1


Another possibility (who can fathom the mind of an interviewer?) would be to rewrite the equation as:

 x^2 + y^2 = z^2
 x^2 = z^2 - y^2 = (z-y)(z+y)

If we knew the prime factorisation of x^2 then we could simply iterate through all possible factorisations into a pair of numbers p,q (with p < q) and compute

 x^2 = p.q = (z-y)(z+y)
 p+q = (z-y)+(z+y) = 2z
 q-p = (z+y)-(z-y) = 2y
 z = (p+q)/2
 y = (q-p)/2

So given a factorisation x^2=p.q we can work out the z and y values. By putting all the integer values into a set we can then check each possible answer in time O(1) (per check) by looking to see if those z,y values are in the array (taking care that negative values are also detected).

Wikipedia says that a randomly chosen integer has about log(n) divisors so this should take about n.log(n) assuming you can do the factorisation fast enough (e.g. if you knew all the integers were under a million you could precompute an array of the smallest factor for each integer).

Tags:

Algorithm