3SUM problem (finding triplets) in better than O(n^2)
First of all. Finding all triplets in worst case is O(n^3)
. Suppose you have n=3k
numbers. K of them are 3, k are 4 and k are 5.
3,....,3,4,....,4,5,.....5
There are k^3 = n^3/27 = O(n^3)
such triplets. Just printing them takes O(n^3)
time.
Next will be explaining of 3SUM problem in such form:
There are numbers s_1, ..., s_n
each of them in range [-u;u]
. How many triplets a,b,c
there are that a+b=c
?
transforming. Get 2*u numbers
a_-u, ..., a_0, a_1, ... a_u
.a_i
is amount of numberss_j
, thats_j = i
. This is done inO(n+u)
res = a_0 * sum(i=-u..u, i=/=0, C(a_i, 2)) + C(a_0,3)
a_0 = 0
Build a polynom
P(x) = sum(i = -u...u, a_i*x^(i+u)
.Find
Q(x) = P(x)*P(x)
using FFT.Note that
Q(x) = sum(i=-2u..2u, b_i*x^(i+2*u))
, whereb_i
is number of pairss_u
,s_k
thats_u+s_k = i
(This includes using same number twice).For all even
i
dob_i = b_i - a_(i/2)
. This will remove using same number twice.- Sum all
b_i*a_i/2
- add this to res.
Example: to be more simple, I will assume that range for numbers is [0..u] and won't use any +u in powers of x.
Suppose that we have numbers 1,2,3
- a_0 = 0, a_1 = 1, a_2 = 1, a_3 = 1
res = 0
P(x) = x + x^2 + x^3
Q(x) = x^2 +2x^3+3x^4+2x^5+x^6
After subtracting
b_2 = 0, b_3 = 2, b_4 = 2, b_5 = 2, b_6 = 0
res += 0*1/2 + 2*1/2 + 2*0/2 + 2*0/2 + 6*0/2 = 1
Another possibility (who can fathom the mind of an interviewer?) would be to rewrite the equation as:
x^2 + y^2 = z^2
x^2 = z^2 - y^2 = (z-y)(z+y)
If we knew the prime factorisation of x^2 then we could simply iterate through all possible factorisations into a pair of numbers p,q (with p < q) and compute
x^2 = p.q = (z-y)(z+y)
p+q = (z-y)+(z+y) = 2z
q-p = (z+y)-(z-y) = 2y
z = (p+q)/2
y = (q-p)/2
So given a factorisation x^2=p.q we can work out the z and y values. By putting all the integer values into a set we can then check each possible answer in time O(1) (per check) by looking to see if those z,y values are in the array (taking care that negative values are also detected).
Wikipedia says that a randomly chosen integer has about log(n) divisors so this should take about n.log(n) assuming you can do the factorisation fast enough (e.g. if you knew all the integers were under a million you could precompute an array of the smallest factor for each integer).