What is the best way to track the gradient to a local minimum?

The tutorial Line Search Methods contains an example similar to this:

Newton's method effectively uses a quadratic model and solves the equation $$H \bar s = -\nabla f(x,y)\,,$$ where $H$ is the Hessian $H = \nabla^2 f(x,y)$, for the step $\bar s = (\Delta x, \Delta y)$. For a quadratic objective function like $x^2 + 10\,y^2$, this solves for the minimum exactly in a single step. With the step-size reduction imposed by "MaxRelativeStepSize" -> .1, the result is a straight-line trajectory toward the origin. To get a true gradient step instead, we need to set the Hessian to the identity matrix, which can be done via the Hessian method option.
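As a quick check (a minimal sketch, not from the tutorial; the names f, grad, hess, and step are mine), one explicit Newton step on that quadratic lands exactly on the minimum at the origin, no matter the starting point:

f = x^2 + 10 y^2;
grad = D[f, {{x, y}}];               (* {2 x, 20 y} *)
hess = D[f, {{x, y}, 2}];            (* {{2, 0}, {0, 20}} *)
step = -LinearSolve[hess, grad];     (* solves H s == -grad *)
{x, y} + step /. {x -> 10, y -> 10}  (* -> {0, 0} *)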

The following shows the difference on the OP's original problem, $x^2 + y^4$, between the default Newton's method (purple) and naive gradient descent (red), obtained by overriding the Hessian with the identity matrix. In the code, EvaluationMonitor :> Sow[{x, y}] records each point at which the objective is evaluated, and Reap collects the points for ListLinePlot.

Show[
 ContourPlot[x^2 + y^4, {x, -0.5, 10.5}, {y, -0.5, 10.5}, 
  Contours -> 2^Range[15]],
 ListLinePlot[
  Last[Reap[
    FindMinimum[x^2 + y^4, {x, 10}, {y, 10}, 
     Method -> {"Newton", 
       "StepControl" -> {"LineSearch", "MaxRelativeStepSize" -> .1}},
     EvaluationMonitor :> Sow[{x, y}]]]], PlotRange -> All, 
  PlotMarkers -> {Automatic, 10}, PlotStyle -> Purple],
 ListLinePlot[
  Last[Reap[
    FindMinimum[x^2 + y^4, {x, 10}, {y, 10}, 
     Method -> {"Newton", Hessian -> {{1, 0}, {0, 1}},    (* <-- override Hessian *)
       "StepControl" -> {"LineSearch", "MaxRelativeStepSize" -> .1}},
     EvaluationMonitor :> Sow[{x, y}], MaxIterations -> 200]]], 
  PlotRange -> All, PlotMarkers -> {Automatic, 10}, PlotStyle -> Red]
 ]

[Plot: contours of x^2 + y^4 with the Newton trajectory (purple) and the gradient-descent trajectory (red)]

Note that gradient descent does not "converge to the requested accuracy or precision within 200 iterations" (FindMinimum::cvmit).
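That slow convergence is inherent to plain gradient descent here: along the $y$ direction the objective behaves like $y^4$, so a gradient step with step size $\eta$ gives $$y_{k+1} = y_k - 4\eta\,y_k^3 = y_k\left(1 - 4\eta\,y_k^2\right),$$ and the contraction factor approaches $1$ as $y_k \to 0$. Convergence near the minimum is only sublinear, so hitting MaxIterations is expected.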


You can use FindRoot to find zeros of the derivative. FindRoot has an option DampingFactor that should serve your purpose:

ListLinePlot[
 Last[Reap[
   FindRoot[{D[x^2 + y^4, x] == 0, 
     D[x^2 + y^4, y] == 0}, {{x, 10}, {y, 10}}, DampingFactor -> .1, 
    EvaluationMonitor :> Sow[{x, y}]]]], PlotRange -> All, 
 PlotMarkers -> {Automatic, 10}]

[Plot: the damped FindRoot iterates, starting from {10, 10}]
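For context: DampingFactor -> $\lambda$ scales the Newton step, so the iteration is (roughly) $$z_{k+1} = z_k - \lambda\,J(z_k)^{-1} F(z_k)\,,$$ where $J$ is the Jacobian of the system $F$. With $\lambda = .1$, each iteration moves only a tenth of the full Newton step, which is what produces the gradual trajectory above.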

Update:

The default algorithm for FindMinimum and FindRoot is Newton's method, not a gradient method. If you want a true gradient method, it's easiest to implement it yourself:

f[x_, y_] := x^2 + y^4
list = Module[{d = 10^-3., steps = 10000, x = 10, y = 10, dx, dy, xn, yn, xp, yp},
   xn = x;
   yn = y;
   Table[
    x = xn;
    y = yn;
    (* evaluate the partial derivatives at the current point *)
    dx = D[f[xp, yp], xp] /. {xp -> x, yp -> y};
    dy = D[f[xp, yp], yp] /. {xp -> x, yp -> y};
    (* take one gradient step with fixed step size d *)
    xn = x - d*dx;
    yn = y - d*dy;
    {x, y, dx, dy},
    {i, 0, steps}]];

ListLinePlot[list[[All, {1, 2}]], PlotRange -> All, PlotMarkers -> {Automatic, 10}]

[Plot: the recorded gradient-descent path]

Of course, there is plenty of room to improve the algorithm, e.g. by making it more robust or choosing a smarter step size.
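For instance, here is a slightly more idiomatic sketch of the same fixed-step descent (same step size and iteration count; the names grad, step, and path are mine, and x and y are assumed to have no global values). It computes the symbolic gradient once instead of differentiating inside the loop:

f[x_, y_] := x^2 + y^4
grad[{x_, y_}] = D[f[x, y], {{x, y}}]; (* Set, not SetDelayed: D evaluates once, here *)
step[pt_] := pt - 10^-3. grad[pt]      (* one fixed-size gradient step *)
path = NestList[step, {10., 10.}, 10000];

ListLinePlot[path, PlotRange -> All, PlotMarkers -> {Automatic, 10}]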


Just for fun, here is a version using NDSolve:

sol = NDSolve[
    {
        x'[t] == -2 x[t] Exp[t],
        y'[t] == -4 y[t]^3 Exp[t],
        x[0] == 10,
        y[0] == 10,
        (* stop once the velocity, i.e. the scaled gradient, is numerically zero *)
        WhenEvent[x'[t]^2 + y'[t]^2 < 10^-10, end = t; "StopIntegration"]
    },
    {x, y},
    {t, 0, Infinity}
];
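Without the Exp[t] factor, this is just the gradient flow $\dot x = -\partial_x f$, $\dot y = -\partial_y f$ for $f = x^2 + y^4$. The substitution $\tau = e^t - 1$ maps the scaled system back to that plain flow: $$\frac{dz}{d\tau} = \frac{dz}{dt}\,\frac{dt}{d\tau} = \bigl(-e^t\,\nabla f(z)\bigr)\,e^{-t} = -\nabla f(z)\,,$$ so the trajectory is unchanged and only the clock runs faster: flow time $\tau$ is already reached at $t = \log(1 + \tau)$.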

That is why I added the scaling factor (Exp[t]): it keeps the t range from becoming too large. Here's a plot of the result:

Show[
    ContourPlot[x^2 + y^4, {x, -1, 11}, {y, -1, 11}, Contours->2^Range[15]],
    ParametricPlot[{x[t], y[t]} /. sol, {t, 0, end}]
]

[Plot: contours of x^2 + y^4 with the NDSolve trajectory overlaid]