A fast way to expand symbolic products
Not an answer, but just an attempt to analyze the problem. IMHO, what causes
Mathematica to break down in this calculation is not the number of terms, but their complexity (i.e. high LeafCount
). For example, we can try to substitute the products of Unique
symbols with simple c[i]
fac[n_] :=
Sum[Product[Unique[x], {i, 1, RandomInteger[{1, 5}]}] a[i], {i, 1, n}]
expr = fac[#] & /@ ConstantArray[10, 6];
counter = 0;
repFun[ex_Plus] :=
(
li = List @@ ex;
Table[counter++; h[li[[i]]] -> c[counter], {i, 1, Length[li]}]
)
The following code gives us the corresponding substitution rule
repRule = Flatten[repFun /@ expr]
which makes expr
look like
exprAbbr = Map[h /@ #1 &, expr] /. repRule;
this
{c(61)+c(62)+c(63)+c(64)+c(65)+c(66)+c(67)+c(68)+c(69)+c(70),
c(71)+c(72)+c(73)+c(74)+c(75)+c(76)+c(77)+c(78)+c(79)+c(80),
c(81)+c(82)+c(83)+c(84)+c(85)+c(86)+c(87)+c(88)+c(89)+c(90),
c(91)+c(92)+c(93)+c(94)+c(95)+c(96)+c(97)+c(98)+c(99)+c(100),
c(101)+c(102)+c(103)+c(104)+c(105)+c(106)+c(107)+c(108)+c(109)+c(110),
c(111)+c(112)+c(113)+c(114)+c(115)+c(116)+c(117)+c(118)+c(119)+c(120)}
Now we can expand exprAbbr
in about 2 seconds.
AbsoluteTiming[resRaw = Expand[Times @@ exprAbbr];]
However, when trying to perform backsubstitution we loose 16.5 seconds,
and this is where the high LeafCount
kicks in.
revRepRule = (Reverse /@ repRule) /. h -> Identity;
AbsoluteTiming[res = resRaw /. Dispatch[revRepRule];]
So at the end of the day it is true that the only sensible way to solve this problem is to switch to FORM.
In a way I find it somewhat sad, given the amount of effort put into development of Mathematica. However, it seems that good performance when handling very large expressions is not ranked very high in the priority list.
Edit:
In fact, even with FORM this expansion is not entirely trivial. Using
fac[n_] :=
Sum[Product[Unique["x"], {i, 1, RandomInteger[{1, 5}]}] a[i], {i, 1,
n}]
expr = fac[#] & /@ ConstantArray[10, 6];
str = "L expr = " <>
StringReplace[
ToString[Times @@ expr, InputForm], {"[" -> "(", "]" -> ")"}] <>
";"
SetDirectory[NotebookDirectory[]];
file = OpenWrite["expr.frm"];
WriteString[file, str]
Close[file];
we can convert the input expression into FORM format and then evaluate it using following FORM code
Auto S x;
CF a;
#include expr.frm
.sort
Format Mathematica;
#write <expr-expanded.m> "\"(%E)\"",expr
.end
Looking at the FORM output statistics
# time form formExpand.frm
FORM 4.1 (Aug 28 2019, v4.1-20131025-482-g0b3ab5d) 64-bits Run: Fri Dec 27 19:31:39 2019
Auto S x;
CF a;
#include expr.frm
L expr = (x73*x74*x75*a(1) + x76*a(2) + x77*x78*x79*x80*x81*a(3) + x82*x83*x84*a
(4) + x85*x86*x87*x88*x89*a(5) + x90*a(6) + x91*x92*a(7) + x93*x94*x95*x96*x97*a
(8) + x100*x101*x98*x99*a(9) + x102*x103*x104*x105*x106*a(10))*(x107*x108*a(1) +
x109*x110*a(2) + x111*x112*a(3) + x113*a(4) + x114*x115*x116*x117*a(5) + x118*x
119*a(6) + x120*x121*a(7) + x122*x123*x124*a(8) + x125*a(9) + x126*x127*x128*x12
9*a(10))*(x130*x131*x132*a(1) + x133*a(2) + x134*x135*a(3) + x136*x137*x138*x139
*x140*a(4) + x141*x142*x143*x144*x145*a(5) + x146*x147*x148*x149*x150*a(6) + x15
1*x152*x153*x154*x155*a(7) + x156*a(8) + x157*x158*a(9) + x159*x160*x161*a(10))*
(x162*a(1) + x163*a(2) + x164*x165*a(3) + x166*a(4) + x167*a(5) + x168*x169*x170
*x171*a(6) + x172*x173*x174*x175*a(7) + x176*a(8) + x177*x178*x179*x180*x181*a(9
) + x182*a(10))*(x11*x12*x13*a(1) + x14*x15*a(2) + x16*x17*x18*x19*x20*a(3) + x2
1*x22*x23*a(4) + x24*x25*x26*x27*x28*a(5) + x29*x30*a(6) + x31*x32*x33*a(7) + x3
4*x35*x36*x37*x38*a(8) + x39*a(9) + x40*x41*x42*a(10))*(x43*x44*x45*x46*a(1) + x
47*x48*x49*x50*x51*a(2) + x52*x53*a(3) + x54*x55*x56*x57*a(4) + x58*x59*x60*x61*
x62*a(5) + x63*x64*a(6) + x65*a(7) + x66*x67*a(8) + x68*x69*x70*x71*a(9) + x72*a
(10));
.sort
Time = 1.08 sec Generated terms = 519379
expr 1 Terms left = 519379
Bytes used = 84983048
Time = 2.05 sec Generated terms = 1000000
expr 1 Terms left = 1000000
Bytes used = 156406260
Time = 2.42 sec Generated terms = 1000000
expr Terms in output = 1000000
Bytes used = 154359092
Format Mathematica;
#write <expr-expanded.m> "\"(%E)\"",expr
.end
Time = 6.43 sec Generated terms = 537298
expr 537299 Terms left = 537298
Bytes used = 81653732
Time = 6.77 sec Generated terms = 1000000
expr 1000000 Terms left = 1000000
Bytes used = 154359216
Time = 7.00 sec Generated terms = 1000000
expr Terms in output = 1000000
Bytes used = 154359092
7.00 sec out of 7.01 sec
form formExpand.frm 4,58s user 2,48s system 99% cpu 7,071 total
we see that FORM needs only 2.4 seconds for the expansion. But since you probably want to continue processing the expressions with Mathematica and not FORM, we need to export it into Mathematica format and then parse it from Mathematica. Here we loose about 4.2 seconds for writing the output to disk (the file is over 100 MB large!)
Then loading the result
AbsoluteTiming[
res = Import["expr-expanded.m", "String"];
res = ToExpression[res];]
still requires about 3 seconds. So there is a caveat here: If you do everything with FORM you can gain performance, but exchanging results between FORM and Mathematica costs time and should be, in general, minimized.
Of course, FORM has its own binary format for saving and loading results in an efficient way. Unfortunately, Mathematica cannot read that.
Notice also that if you try to naively parse the full expression directly without the String
/ToExpression
detour, you will loose more time. With
#write <expr-expanded.m> "(%E)",expr
doing
AbsoluteTiming[res = Get["expr-expanded.m"];]
requires about 20 seconds!
Regarding FORM and Windows, you definitely need to install Cygwin to get it compiled. If you completely fail I can have a look at it later.
The question probably should be marked as a duplicate since it appears periodically, because Expand is so fundamental operation. And as it was noted by many peoples (including vsht in the comment above) we hardly can hope to beat it. However after implementing it in non commutative algebra (and after noting this question) I became myself curious how close we could get it for commutative algebra. The approach uses ListConvolve procedure from Carl Woll mentioned in my comment above.
First are the original data
fac[n_] :=
Sum[Product[Unique[x], {i, 1, RandomInteger[{1, 5}]}] a[i], {i, 1, n}]
exprInput = fac[#] & /@ ConstantArray[10, 6];
Next I define very restrictive functions for conversion to/from new Association structure
toAssoc[expr_Plus] :=
Map[ToString,
Association[
MapAt[ToString,
Thread[((List @@ expr) /. a[_] :> 1) -> ((List @@ expr) /.
Thread[Rule[DeleteCases[Variables[expr], a[_]], 1]])], {All,
1}]]]
fromAssoc[expr_Association] :=
Plus @@ KeyValueMap[ToExpression[StringJoin[#1, " ", #2]] &, expr]
fromAssoc[toAssoc[exprInput[[1]]]] - exprInput[[1]]
(* 0*)
modifiedInput = toAssoc /@ exprInput;
Next comes multiplication of single pair
myFastProductPairExpand[a_Association, b_Association] :=
Merge[ListConvolve[Keys[a], Keys[b], {1, -1}, 0,
StringJoin[#1, " ", #2] -> StringJoin[a[#1], " ", b[#2]] &,
DeleteCases[List[##], 0] &], Total]
And formal expansion:
answerAssoc =
myFastProductPairExpand[modifiedInput[[1]],
myFastProductPairExpand[modifiedInput[[2]],
myFastProductPairExpand[modifiedInput[[3]],
myFastProductPairExpand[modifiedInput[[4]],
myFastProductPairExpand[modifiedInput[[5]],
modifiedInput[[6]]]]]]]; // AbsoluteTiming
(* 11.9 *)
To compare against
origInput = Times @@ exprInput;
answerExpand = origInput // Expand; // AbsoluteTiming
(* 10. 6 *)
and check
fromAssoc[answerAssoc] - answerExpand
(*0*)
Of course, without taking into account of fromAssoc time.
Note that in ListConvolve we keep the convolution kernel as small as possible. If one reverses arguments, the computation time increases drastically. In order to get this speed I also used the fact that there are no powers of variables in the input and output. Otherwise we would need to implement some ordering of product keys, which would take additional time. Of course, variant with List instead of String in keys and values can be implemented (probably slower).
A slower version of similar procedure, which uses ordinary Times both for keys and values is given below. It is also more universal and do not require sorting of keys, since the job is done by Times itself.
toAssoc2[expr_Plus] :=
Association[
MapAt[Times,
Thread[((List @@ expr) /. a[_] :> 1) -> ((List @@ expr) /.
Thread[Rule[DeleteCases[Variables[expr], a[_]], 1]])], {All,
1}]]
fromAssoc2[expr_Association] := Plus @@ Times @@@ (Normal[expr])
modifiedInput2 = toAssoc2 /@ exprInput
myFastProductPairExpand2[a_Association, b_Association] :=
Merge[ListConvolve[Keys[a], Keys[b], {1, -1},
0, (#1*#2 -> a[#1]*b[#2]) &, DeleteCases[List[##], 0] &], Total]
answerAssoc2 =
myFastProductPairExpand2[modifiedInput2[[1]],
myFastProductPairExpand2[modifiedInput2[[2]],
myFastProductPairExpand2[modifiedInput2[[3]],
myFastProductPairExpand2[modifiedInput2[[4]],
myFastProductPairExpand2[modifiedInput2[[5]],
modifiedInput2[[6]]]]]]]; // AbsoluteTiming
(* 27.1 *)
fromAssoc2[answerAssoc2] - answerExpand
(* 0*)
So with ListConvolve + Associations one can implement very fast purely functional "expansion" procedure. It is still, however, slower than Expand. May be somebody could optimize the above idea further.