Efficient matrix-power-like algorithm

Here's a functional implementation of the binary method. The Fold iterates only $\lfloor \log_2 n \rfloor$ times, performing either one or two "multiplications" at each step.

(* The Fold state is the pair {runningSquare, accumulator}; the accumulator
   is Null until the first 1 bit of n has been consumed. The digit list b
   is least-significant-bit first. *)
power[f_][a_, n_Integer] := With[{b = Reverse@IntegerDigits[n, 2]},
  Last@Fold[
    With[{p = f[#1[[1]], #1[[1]]]}, (* square the running power *)
      {p,
       If[#2 == 1, (* on a 1 bit, fold the new square into the accumulator *)
        If[#1[[2]] === Null, p, f[p, #1[[2]]]],
        #1[[2]]]}] &,
    {a, If[First@b == 1, a, Null]}, Rest@b]]

Here's an example, using a pure function that logs each multiplication it is asked to perform as well as actually carrying it out. Of course, f could be any associative dyadic operation, and a could be any suitable expression, not just a scalar as shown.

power[(Print["(", #1, "\[Times]", #2, ")"]; Times[#1, #2]) &][a, 6]

This will output $a^6$, but also print three lines $(a \times a)$, $(a^2 \times a^2)$ and $(a^4 \times a^2)$.

I've used Null as a placeholder to mean "no result yet". One could certainly add an argument specifying the actual identity element of f instead. Special-casing the least significant bit of n avoids an unnecessary extra squaring of the running power $a^{2^k}$ at the very end.
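For comparison, here is a sketch of a left-to-right (most-significant-bit-first) variant, which needs neither the Null placeholder nor an identity element, because the leading 1 bit lets it seed the fold with a itself (the name powerLR is mine, not part of the code above):

(* Seed with a for the leading bit; then square at every remaining bit *)
(* and fold in an extra factor of a at the 1 bits. *)
powerLR[f_][a_, n_Integer /; n >= 1] :=
  Fold[If[#2 == 1, f[f[#1, #1], a], f[#1, #1]] &,
   a, Rest@IntegerDigits[n, 2]]

For $n = 6$ this also uses three multiplications, just different ones: $(a \times a)$, $(a^2 \times a)$ and $(a^3 \times a^3)$.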


Note: The method I describe below does not find the optimal solution (i.e. the minimal number of multiplications), but it usually does better than the binary approach, and, more importantly, it is very easy to extend and optimize for special cases. We could easily pre-compute the exponents for which it does worse than the simple binary method, up to, say, 1000, and special-case those.

According to the Wikipedia article on the topic:

... the determination of a shortest addition chain seems quite difficult: no efficient optimal methods are currently known for arbitrary exponents, and the related problem of finding a shortest addition chain for a given set of exponents has been proven NP-complete.

Therefore trying to compute the optimal solution within the function (as opposed to precomputing it and building a lookup table) defeats the purpose of using this for optimization.


I did a version of this with C++ template metaprogramming here.

This is a direct translation:

Let f be an associative function, then:

Clear[pow]
(* Syntax: pow[exponent][base] *)
pow[n_][a_] /; Mod[n, 2] == 0 := With[{t = pow[n/2][a]}, f[t, t]]
pow[n_][a_] /; Mod[n, 3] == 0 := With[{t = pow[n/3][a]}, f[f[t, t], t]]
pow[n_][a_] := f[pow[n - 1][a], a]
pow[1][a_] := a

Testing:

f[x_, y_] := (Print["x"]; x y)

Then pow[10][2] gives

During evaluation of In[389]:= x

During evaluation of In[389]:= x

During evaluation of In[389]:= x

During evaluation of In[389]:= x

Out[389]= 1024

i.e. it was done in four multiplications. You can think a bit about which subdivision (by 3 or by 2) is best for different numbers. It can be verified by hand that among exponents under 100, for 33 and 69 (and exponents reducible to these, such as $67 = 2\times 33 + 1$) it's better not to subdivide into 3 first. We can easily special-case pow for these by adding the following to the beginning of pow's definition:

pow[33][a_] := f[pow[32][a], a]
pow[69][a_] := f[pow[68][a], a]
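
Such cases needn't be found by hand. Here is a rough sketch of the pre-computation idea from the note above: chainCount temporarily swaps the printing f for a counting one (via Block, so the definition above is untouched), and binaryCount is the standard $\lfloor\log_2 n\rfloor + (\text{number of 1 bits}) - 1$ count for the binary method; the names chainCount and binaryCount are mine. Run it against pow before adding the special cases.

(* Count the multiplications pow[n] performs *)
chainCount[n_] := Block[{k = 0},
  Block[{f}, f[x_, y_] := (k++; x y); pow[n][2]]; k];

(* Multiplications used by the binary method *)
binaryCount[n_] := Floor[Log2[n]] + DigitCount[n, 2, 1] - 1;

(* Exponents up to 100 where the 2/3 subdivision loses to binary *)
Select[Range[2, 100], chainCount[#] > binaryCount[#] &]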

Similarly, one might expect that for e.g. 82 it's better not to subdivide into 2, since $82 - 1 = 81 = 3^4$, and to use

pow[82][a_] := f[pow[81][a], a]

Counting the calls shows otherwise, though: the route through $3^4$ takes 9 multiplications (two per cubing, plus the final one), while the default halving already reaches $a^{82}$ in 8, so this particular special case does not pay.

A case where a special rule does help is 85 (and similarly 95), where it's better to subdivide into 5 first:

pow[85][a_] := pow[17][ pow[5][a] ]
pow[95][a_] := pow[19][ pow[5][a] ]

But most of these are just tweaks that will not make a huge difference.

I believe that the value of my implementation lies in the ease of its extension for special cases like these for small exponents.


For the adventurous with a lot of free time, see this paper on Pippenger's algorithm.


The answer depends on the structure of the algebra.

I therefore cannot give a universal answer. Instead, I will illustrate the structural dependence by means of examples. In response to the request for implementation, they also illustrate one possible way to exploit Mathematica's support for symbolic computations. (I welcome comments suggesting clearer or more efficient approaches.)

Take, for instance, the second example of the question. Here is an abbreviated implementation of the wedge product of alternating bilinear forms, enough to get us going:

SetAttributes[Wedge, Flat];
(* Product of basis elements: merge the index lists, tracking the sign *)
Wedge[element[ω___], element[η___]] :=
   With[{ϕ = Flatten[{ω, η}]},
    signature[ϕ] (element @@ Union[ϕ])
    ];
(* Bilinearity: pull scalar coefficients out of either factor *)
Wedge /: Wedge[Times[x_, ω_element], y_] := Times[x, Wedge[ω, y]];
Wedge /: Wedge[x_, Times[y_, ω_element]] := Times[y, Wedge[x, ω]];
Wedge[ω_, 0] := 0;
Wedge[0, ω_] := 0;

It requires calculation of the signature of a permutation of arbitrary sortable objects, extended to have the value $0$ whenever there's a repetition. Here's a reasonably efficient implementation:

(* The repetition check must come first: both rules have the same pattern,
   so Mathematica tries them in the order they are defined *)
signature[p_List] /; Length[Union[p]] != Length[p] := 0;
signature[p_List] :=
  With[{c = First[PermutationCycles[Ordering[p]] /. Cycles -> List]},
   (-1)^(Length[Flatten[c]] - Length[c])
   ];
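
A few quick sanity checks (results as comments):

signature[{2, 1, 3}]           (* -1: one transposition *)
signature[{1, 1, 2}]           (* 0: repeated entry *)
Wedge[element[2], element[1]]  (* -element[1, 2] *)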

More generally, in an arbitrary algebra the product will be computed by means of a matrix of structure constants specifying the product of any two elements in a given basis. The foregoing is a programmatically simple way to provide those structure constants for $\Lambda(V)$ for any finite-dimensional vector space $V$ (note that the dimension of $\Lambda(V^n)$ is $2^n$). For efficiency (in low dimensions, anyway) we could compute these structure constants once and for all and cache them.
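
For instance, the caching could use the standard memoization idiom on products of basis elements (wedgeBasis is a hypothetical name, not part of the implementation above):

(* First call computes via Wedge and stores the result; later calls look it up *)
wedgeBasis[ω_element, η_element] := wedgeBasis[ω, η] = Wedge[ω, η]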

The trick to computing powers in algebras with lots of zeros and other symmetries in their structure matrices is to simplify as you go, as here:

Clear[WedgePower];
WedgePower[ω_, n_] /; n >= 2 :=  
  Nest[Distribute[Wedge[ω, #], Plus, Wedge] &, ω, n - 1];
WedgePower[ω_, 1] := ω;
WedgePower[ω_, 0] := 1;

Distribute implicitly does the work of assembling the pairs of basis elements to be multiplied, collecting comparable terms, such as $dx_1\wedge dx_2 = -dx_2\wedge dx_1$, and summing their coefficients. As this is built into Mathematica, we can expect it to be reasonably efficient.
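
To see the mechanism in isolation, here is Distribute acting on purely symbolic arguments (no rules are attached to x, y, u, v, so the individual Wedge products stay unevaluated):

Distribute[Wedge[x + y, u + v], Plus, Wedge]
(* Wedge[x, u] + Wedge[x, v] + Wedge[y, u] + Wedge[y, v] *)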

Take, for instance, an arbitrary 2-form $\omega$ in $\Lambda(V^{10})$. It will have up to $\binom{10}{2} = 45$ nonzero coefficients; its square will have up to $\binom{10}{4} = 210$ nonzero coefficients; its cube, $210$; its fourth power, $45$; and its fifth power must be a multiple of a single form. Because you're working in an algebra of dimension $2^{10}=1024$, in principle each successive product requires $(2^{10})^2$, or over a million, calculations (although most of them will be products of zeros). Here, bilinearity hugely reduces that number: the calculations needed to compute generic squares, cubes, fourth, and fifth powers of a 2-form are $2025$, $11475$, $20925$, and $22950$, respectively (cumulative counts: each figure includes the products performed for the lower powers).
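
For concreteness, here is a sketch of that computation with generic symbolic coefficients (the names ω and c[i, j] are mine):

(* A generic 2-form in a 10-dimensional space: 45 terms *)
ω = Sum[c[i, j] element[i, j], {i, 1, 9}, {j, i + 1, 10}];
Length[WedgePower[ω, 2]]  (* 210: one term per basis 4-form *)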

Trying to save time by breaking up powers into smaller groups (via the "binary method" or otherwise) can backfire. Continuing this example, computing $\omega^4$ via successive squaring requires first $2025$ operations to compute $\omega^2$ and then $44100$ to square that, a total of $46125$ products. Compare this to the value of $20925$ needed for successive wedging by $\omega$ itself: it will take more than twice as much effort.

By choosing an appropriate basis, the Liouville form can be written in a special way as $p_1 \wedge q_1 + \cdots + p_k \wedge q_k$. This reduces the calculations still further. With $k=5$ (working once more in a ten-dimensional space), for instance, the second through fifth powers require only 25, 50, 50, and 25 wedge operations, respectively. This suggests yet another line of attack: before computing powers of an element of an algebra, change the basis if possible to simplify the calculation. Once again, though, how this is done will depend on the algebraic structure.
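
In the notation of the implementation above, such a normal form and its top power might look like this (a sketch, with the index pair $\{2i-1, 2i\}$ playing the role of $p_i \wedge q_i$):

(* Canonical 2-form with k = 5: every cross term enters with sign +1 *)
ωc = Sum[element[2 i - 1, 2 i], {i, 1, 5}];
WedgePower[ωc, 5]  (* 5! element[1, 2, ..., 10], a multiple of the volume form *)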