Why is gradient in the direction of ascent but not descent?
The comments persuaded me to reformulate my answer. For the original (still correct, but sub-optimal) version, see below.
The gradient is defined in a completely natural way. There is no purely mathematical reason why it should point in the direction of steepest ascent rather than descent. It has more to do with some more or less arbitrary choices made in several definitions, which break this symmetry.
Observation. The concept "gradient points in direction of ascent" also works for single-variable functions $\Bbb R\to\Bbb R$. There is indeed a concept of direction in $\Bbb R$: right and left. A positive derivative is a vector (the gradient) pointing to the right (in the direction of ascent), a negative derivative is a vector pointing to the left (in this case, also the direction of ascent, because the function is decreasing to the right). Since the same observation applies in 1D, we should start looking for an explanation there.
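To make the 1D observation concrete, here is a minimal numerical sketch. The example function $f(x)=x^2$, the helper names, and the step sizes are my own choices for illustration, not part of the argument above. It takes a small step in the direction the derivative points and checks that the function value went up.

```python
def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def change_along_derivative(f, x, step=1e-3):
    """Change in f after a small step in the direction the derivative points."""
    d = derivative(f, x)
    direction = 1.0 if d > 0 else -1.0  # +1 means "right", -1 means "left"
    return f(x + step * direction) - f(x)


# Hypothetical example function, chosen only for illustration.
f = lambda x: x ** 2

# At x = 2 the derivative is positive (points right);
# at x = -2 it is negative (points left).
change_right = change_along_derivative(f, 2.0)
change_left = change_along_derivative(f, -2.0)

# In both cases the function value increased.
print(change_right > 0, change_left > 0)
```

On either side of the minimum, following the sign of the derivative moves uphill, which is exactly the "gradient points toward ascent" statement in one dimension.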
Note: I am going to use the terms "right" and "left" for the directions "positive" and "negative" on the number line, because this is the standard orientation of the number line. This is also a symmetry break, but only a notational one. It does not affect the mathematics in any way if we flip these directions.
Thanks to the comments, a few definitions could be identified as the root cause of the broken symmetry. If you are standing on a mountainside, there is no meaning in asking whether it is uphill or downhill. This question only makes sense once you fix a direction with respect to which the slope is judged. The same goes for single-variable functions. It is standard to call a function increasing if its graph goes uphill to the right. This involves two arbitrary choices:
- The $y$-axis points upwards, hence increasing function values are seen as going up. This is the most obvious arbitrary choice. Many applications do it the other way around, e.g. line numbers in text increase from top to bottom, and pixels on a screen are usually addressed with a downward-increasing $y$-axis.
- The kind of slope is judged w.r.t. the "arbitrary" direction "right". Why not left? It seems natural, but it is not forced.
There might be another arbitrary choice: a positive derivative indicates that the function is increasing. We could have defined it the other way around. In any case, flipping any single one of these definitions will change the gradient from pointing uphill to pointing downhill.
Note. Yes, I know, "increasing" is formally defined as $x\le y\implies f(x)\le f(y)$, but this definition, too, is motivated by the picture of a function graph rising to the right. No one would use it if the $y$-axis pointed downwards.
Conclusion: The reason for the gradient pointing to the steepest ascent is based in our somewhat biased definitions. This is especially evident in the 1D-case. The derivative is defined in such a way so that it has a positive value (the gradient points to the right) if the function increases. A function is called increasing if its function graph is going uphill to the right. You see how these arbitrary definitions combine to "gradient is pointing uphill".
ORIGINAL
Because we have a somewhat biased definition of differentiation. Let me explain.
As noted in a comment, this "gradient pointing in direction of ascent" also works for single-variable functions $\Bbb R\to\Bbb R$. There is indeed a concept of direction in $\Bbb R$: left and right. A positive derivative is a vector (the gradient) pointing to the right (in the direction of ascent), a negative derivative is a vector pointing to the left (also in the direction of ascent, because the function is decreasing to the right). Since the same observation holds in $1$D, we should probably start looking for an answer there.
It all happens because the definition of the derivative is biased in some sense. Someone once decided that a function is considered increasing if its value gets bigger to the right. You see the broken symmetry? Why to the right, why not to the left? It was then decided that the derivative is positive (the gradient points to the right) when the function grows to the right. Here we have it. Whoever defined it directly coupled the terms "direction of gradient" and "direction of ascent".
Had they instead decided to define a function as increasing if its value grows to the left (unnatural, considering our left-to-right reading direction), then the gradient would point in the direction of steepest descent.
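As a sketch of what such a flipped convention could look like (the symbol $D_{\leftarrow}$ is hypothetical, introduced here only for illustration), measure the change of $f$ per unit step to the left instead of to the right:

$$D_{\leftarrow}f(x) := \lim_{h\to 0}\frac{f(x-h)-f(x)}{h} = -\lim_{h\to 0}\frac{f(x-h)-f(x)}{-h} = -f'(x).$$

For a differentiable $f$ this is positive exactly when $f$ grows to the left. If the sign of $D_{\leftarrow}f(x)$ were still drawn as a vector on the usual number line, it would point toward descent. For example, with $f(x)=x^2$ at $x=1$ we get $D_{\leftarrow}f(1)=-2$, a vector pointing left, toward the minimum at $0$.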
Note: In this answer I assumed that the number line is oriented with the positive numbers on the right. This is standard, but it is another symmetry break (only a notational one, without impact on the mathematics). You can substitute negative/positive for all occurrences of left/right above if you want to be independent of this broken symmetry.
The gradient with regard to some input variable indicates how much the value of the output variable goes up (i.e. ascends) when that input variable goes up. As such, if you move in the direction of the gradient, and the gradient is positive, then the value of the output variable will go up. If the gradient is negative, however, then increasing the input variable will decrease the output variable. But yes, a positive gradient means that you will ascend if you 'follow' the gradient.
This is our definition of gradients or slopes, and works just the same in $\mathbb{R}$ as in $\mathbb{R}^n$. That is, when we take the derivative, the derivative $\frac{dy}{dx}$ indicates to what extent $y$ increases as $x$ increases which, by the way, is the same as the extent to which $y$ decreases as $x$ decreases.
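To spell out why the two readings agree (a short sketch using the usual limit definition, assuming $y=f(x)$ is differentiable and writing $h$ for the step): replacing $h$ by $-h$ in the difference quotient does not change the limit,

$$\frac{dy}{dx}=\lim_{h\to 0}\frac{f(x+h)-f(x)}{h}=\lim_{h\to 0}\frac{f(x-h)-f(x)}{-h}=\lim_{h\to 0}\frac{f(x)-f(x-h)}{h}.$$

So the same number measures the rise in $y$ per unit step up in $x$ and the drop in $y$ per unit step down in $x$.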
Now, some answers suggest that when we defined the gradient in this manner there was a bit of 'arbitrariness' involved, and that we could have defined the gradient differently so that the direction would reverse. It is even suggested that this has something to do with 'right' being arbitrarily chosen as the 'up' direction and 'left' being 'down'.
However, I strongly disagree with those answers, because the alternative would have been to say that the gradient is the extent to which $y$ increases as $x$ decreases, which would be a very confusing and unnatural thing to do; it's like bringing in an extraneous negative or reversal.
Anyway, when you want the output value to descend, you should go in the 'other' direction of the gradient, i.e. subtract a value proportional to that gradient. Of course, that does mean that if the gradient is negative, you end up increasing the input variable in order to decrease the output variable.
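The "subtract a value proportional to the gradient" rule can be sketched in a few lines. This is only an illustration under my own assumptions: the function $f(x)=(x-3)^2$, the name `gradient_descent`, the learning rate, and the step count are all made up for the example.

```python
def gradient_descent(df, x0, learning_rate=0.1, steps=100):
    """Repeatedly move against the derivative df, starting from x0."""
    x = x0
    for _ in range(steps):
        # Subtract a value proportional to the gradient.
        x = x - learning_rate * df(x)
    return x


# Hypothetical example: f(x) = (x - 3)^2, so f'(x) = 2 * (x - 3).
df = lambda x: 2 * (x - 3)

# At x0 = 0 the gradient is negative, so the very first step increases x.
first_step = gradient_descent(df, x0=0.0, steps=1)

# After many steps we end up close to the minimizer x = 3.
x_min = gradient_descent(df, x0=0.0)
print(first_step, x_min)
```

Starting left of the minimum, the gradient is negative and the update increases the input; starting right of it, the gradient is positive and the update decreases the input. Either way the output value goes down.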
"There is no concept of direction for the single-variable function as obvious."
False.
When the univariate derivative is positive, the function increases in the direction of increasing input; when negative, in the direction of decreasing input. There are only two possible directions, but whichever it is, the derivative does point in the direction in which the function increases.