Convolve integers in subquadratic time
Python 2, 250 bytes, n^log_2(3)
s=lambda a,b,S=1:[x+y*S for x,y in zip(a,b)]
def f(a,A):
n=len(a);m=n/2
if m<1:return[a[0]*A[0]]
if n%2:return f([0]+a,[0]+A)[2:]
b,c=a[:m],a[m:];B,C=A[:m],A[m:];p=f(b,B);r=f(c,C);z=m*[0];return s(z+z+r,s(z+s(f(s(b,c),s(B,C)),s(p,r),-1)+z,p+z+z))
This isn't well golfed, but more a proof of concept of the algorithm. Let's look at a more readable version.
s=lambda a,b,S=1:[x+y*S for x,y in zip(a,b)]
def f(a,A):
n=len(a)
if n==1:return[a[0]*A[0]]
if n%2:return f([0]+a,[0]+A)[2:]
m=n/2
b,c=a[:m],a[m:]
B,C=A[:m],A[m:]
p=f(b,B)
r=f(c,C)
q=s(f(s(b,c),s(B,C)),s(p,r),-1)
z=m*[0]
return s(z+z+r,s(z+q+z,p+z+z))
The idea is to divide and conquer adapting the Karatsuba multiplication algorithm to do convolution. As before, it runs in time n^log_2(3)
, as it makes 3 recursive calls which shrink the input length in half. This isn't as good as the quasilinear of DFT, but still better than quadratic.
We can think of convolution as multiplication of formal polynomials. Alternatively, if multiplication were done without regrouping, you'd get convolution. So, adapting the algorithm only requires doing vector addition and subtraction instead of the numerical operations. These additions are still linear-time.
Unfortunately for Python, it doesn't have built-in vector addition and subtraction, so we hack together a function s
to do both: addition by default, and subtraction when passed an optional argument of -1
. Whenever we add, we make sure to pad with zeroes to avoid truncation.
We handle odd list lengths by padding with a zero to an even length, removing the extraneous zeroes that result.
J, 79 bytes, O(n log n)
f=:_2&(0((+,-)]%_1^i.@#%#)&f/@|:]\)~1<#
<:@+&#$9(o.f%#)[:+@*/;f@{.&>~2^#@#:@+&#
Uses the Convolution theorem which states that the Fourier transform of the convolution of two sequences is equal to the Fourier transform of each sequence multiplied together elementwise.
Fourier( Convolve(a, b) ) = Fourier(a) * Fourier(b)
Convolve(a, b) = InverseFourier( Fourier(a) * Fourier(b) )
Also uses the fact the the inverse Fourier transform can be applied by using the Fourier transform of the conjugate.
InverseFourier(a) = Conjugate( Fourier( Conjugate(a) ) ) / Length(a)
I used the FFT implementation from a previous solution. That implementation uses the Cooley-Tukey algorithm for sequences where the length is a power of 2. Therefore, I have to zero-pad the input sequences to the a power of 2 (typically the minimal value that is valid) such that their lengths are greater than or equal to the sum of the lengths of the input sequences.
Usage
f =: _2&(0((+,-)]%_1^i.@#%#)&f/@|:]\)~1<#
g =: <:@+&#$9(o.f%#)[:+@*/;f@{.&>~2^#@#:@+&#
1 2 3 4 g 5 6 7 8
5 16 34 60 61 52 32
2 4 5 6 1 g 1 2 4 7
2 8 21 46 61 61 46 7
Try it online!
Explanation
An explanation for the FFT portion in J is included in my previous solution to a different challenge.
<:@+&#$9(o.f%#)[:+@*/;f@{.&>~2^#@#:@+&# Input: A on LHS, B on RHS
# Get the lengths of A and B
+& Sum them
#:@ Get list of binary digits
#@ Length
2^ Raise 2 to that power, call it N
; Link A and B as a pair of boxed arrays
&>~ For each box
{. Take N values, extending with zeros
f@ Apply FFT to it
[: */ Reduce using multiplication
+@ Conjugate
f Apply FFT
% Divide by
# Length
9 o. Take the real part
# Get the lengths of A and B
+& Sum them
<:@ Decrement
$ Shape to that length and return