How to differentiate formally?

Here is the simplest answer:

sum[n_] := Sum[i x[i], {i, 1, n}]

x /: D[x[i_], x[j_], NonConstants -> {x}] := 
 KroneckerDelta[i, j]

D[sum[n], x[2], NonConstants -> x]

$\begin{cases} 2 & n>1 \\ 1-n & \text{True} \end{cases}$

The trick here is the use of the NonConstants option of the derivative operator. This then has to be combined with a definition stating that the x[i] are independent variables for the purposes of this differentiation (hence the KroneckerDelta on the second line).

Edit: more discussion

And here is another cool result, completely symbolic:

Assuming[m ∈ Integers, D[sum[n], x[m], NonConstants -> x]]

$\left( \begin{array}{cc} \{ & \begin{array}{cc} m & m\geq 1 \\ 0 & \text{True} \\ \end{array} \\ \end{array} \right)-\left( \begin{array}{cc} \{ & \begin{array}{cc} m & m>2\land m\geq n+1 \\ n+1 & m=2\land n=1 \\ \end{array} \\ \end{array} \right)$

This isn't easy to absorb, but it works if you check it with specific examples by doing

condition = %;

Simplify[condition /. m -> 10]

$\begin{cases} 10 & n>9 \\ 0 & \text{True} \end{cases}$

In summary, it's worth pointing out that a lot of symbolic differentiation tasks can be achieved by using either NonConstants specifications in D or conversely using Constants specifications in Dt.

I did some computation of formal derivatives a while back which might be of interest in this context (though keep in mind that this is anything but bullet proof! it does work for the cases I bothered to check though).

Clear[a]; Format[a[k_]] = Subscript[a, k]

Let us say we have an objective function which is formally a function of the vector a[i]

Q = Sum[Log[Sum[a[r] Subscript[B, r][Subscript[x, i]], {r, 1, p}]/
    Sum[a[r] , {r, 1, p}]], {i, 1, n}]

Mathematica graphics

Let us define a couple of rules for formal differentiation as follows

Clear[d];
d[Log[x_], a[k_]] := 1/x d[x, a[k]]
d[Sum[x_, y__], a[k_]] := Sum[d[x, a[k]], y]
d[ a[k_] b_., a[k_]] := b /; FreeQ[b, a]
d[ a[q_] b_., a[k_]] := b Subscript[δ, k, q] /; FreeQ[b, a]
d[ c_  b_, a[k_]] := d[c, a[k]] b + d[b, a[k]] c
d[ b_ +  c_, a[k_]] := d[c, a[k]] + d[b, a[k]]
d[Subscript[δ, r_, q_], a[k_]] := 0
d[x_, a[k_]] := 0 /; FreeQ[x, a]
 d[G_^n_, a[k_]] := n G^(n - 1) d[G , a[k]] /; ! FreeQ[G, a]
 d[Exp[G_], a[q_]] := Exp[G] d[G , a[q]] /; ! FreeQ[G, a]

Unprotect[Sum]; Attributes[Sum] = {ReadProtected};Protect[Sum];

And a rule to deal with Kroneckers

ds = {Sum[a_ + b_, {s_, 1, p_}] :> Sum[a, {s, 1, p}] + Sum[b, {s, 1, p}],
      Sum[ y_ Subscript[δ, r_, s_], {s_, 1, p_}] :> (y /. s -> r),
  Sum[ y_ Subscript[δ, s_, r_], {s_, 1, p_}] :> (y /. s -> r),
  Sum[ Subscript[δ, s_, r_], {r_, 1, p_}] :> 1,
  Sum[δ[i_, k_] δ[j_, k_] y_. , {k_, n_}] -> δ[i, j] (y /. k -> i),
  Sum[a_ b_, {r_, 1, p_}] :> a Sum[b, {r, 1, p}] /; NumberQ[a],
  Sum[a__, {r_, 1, p_}] :>  Sum[Simplify[a], {r, 1, p}] }

Then, for instance, the gradient of Q with respect to one of the a[k] reads

grad = d[Q, a[k]]  /. ds // Simplify;

Mathematica graphics

Similarly the tensor of second derivatives w.r.t. a[k] and a[s] is given by

hess = d[d[Q, a[k]], a[s]] /. ds // Simplify

Mathematica graphics

As a less trivial example let us consider the 4th order derivatives of Q

 d[d[d[d[Q, a[k]], a[s]], a[m]], a[t]]; /. ds // Simplify

Mathematica graphics

For the problem at hand we check easily that

 Q = Sum[r a[r] , {r, 1, p}];

 grad = d[Q, a[k]] // Simplify;
 grad //. ds

returns k as it should

EDIT

This process can be made a bit more general, say, on this Objective function

  Q = 1/2 Sum[(Sum[a[r] Subscript[B, r, i][a[q]], {r, 1, p}] - 
      Subscript[y, i])^2, {i, 1, n}]

Mathematica graphics

which depends non linearly on a[k] via B.

All we need is to add a new rule for d

d[H_[a[q_]], 
  a[k_]] := (D[H[x] , x] /. x -> a[k] ) Subscript[δ, k, q]

Now we readily have

grad = d[Q, a[k]] // Simplify;
hess = d[d[Q, a[k]], a[s]];

grad //. ds

Mathematica graphics

hess /. ds // Simplify

Mathematica graphics

As a other example, let us look at a parametrized entropy distance,

 Q = -Sum[(Sum[a[r] Subscript[B, r, i], {r, 1, p}]/
  Subscript[y, i]) Log[(Sum[a[r] Subscript[B, r, i], {r, 1, p}]/
   Subscript[y, i])], {i, 1, n}]

Mathematica graphics

we can compute its Hessian while mapping twice the sum rule

 Map[# /. ds &, d[d[Q, a[k]], a[s]] /. ds]

Mathematica graphics

As a final example, consider a Poisson likelihood

 Q = Sum[Log[Exp[-a[k]] a[k]^Subscript[y, k]/Subscript[y, k]!], {k, 1, n}]

Mathematica graphics

so that

 grad = d[Q, a[k]] // Simplify

Mathematica graphics

and

  hess =d[d[Q, a[k]], a[s]]  /. ds // Simplify

Mathematica graphics

Of course these algebraic rules are not bullet proof but illustrate nicely the way mathematica handles new grammar.

Starting in M11.1, this works:

sum[n_] = Sum[i x[i],{i,1,n}];

D[sum[n],x[2]] //InputForm

Piecewise[{{2, n >= 2}}, 0]

How to differentiate formally?

Tags:

Attributes

Variable Definitions

Summation

Related

Recent Posts