How to explain Lagrange multipliers to a lay audience?
If you just need to convey the basic intuition behind the use of Lagrange multipliers in optimization, and you are really worried about keeping your audience engaged, you might want to rely on their everyday intuition about level curves, using this kind of drawing.
(Here is the .svg of the picture if you want to generate variants of it. You should probably edit it with Latexdraw 2.0 if you want to be able to alter the path, the gradient, or anything else that is not in the background.)
- You start by giving them the intuition that "you are at a critical point only if the path you are walking on does not cross a level curve, but rather is tangent to one".
- Then you explain how this can be checked using gradients.
- Finally, you try to make them understand that this is equivalent to the existence of a Lagrange multiplier.
People might be a little puzzled by the last step, but you can hope that most of them will follow the first two. Even if they don't get the last one, they may remember that the existence of the multiplier is a convenient way to check, from the algebraic formula of the function, that we are at a critical point.
Here's a terse attempt to convey the main mathematical idea geometrically.
If $f$ is a function of several independent variables, there is an associated gradient vector field $\nabla f$, made up of the partial derivatives of $f$, that has the following property: If $x_{0}$ is a point of the domain and $v$ is a unit vector, the directional derivative $$ \frac{d}{dt}\bigg|_{t=0} f(x_{0} + tv) = \nabla f(x_{0}) \cdot v $$ measures the rate of change of $f$ at $x_{0}$ in the direction $v$.
If $\nabla f(x_{0})$ is non-zero, and if $v$ makes angle $\theta$ with the gradient, the directional derivative reduces to $\|\nabla f(x_{0})\| \cos\theta$. In particular, traveling along the gradient field ($\theta = 0$) causes $f$ to increase "as rapidly as possible", and traveling orthogonal to the gradient ($\theta = \pi/2$) keeps $f$ constant to first order. Geometrically, the gradient field is orthogonal to the level sets of $f$.
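For readers who like to check such identities numerically, here is a minimal sketch comparing a finite-difference directional derivative with $\|\nabla f(x_{0})\| \cos\theta$. The function `f`, the base point, and the helper names are assumptions chosen purely for illustration; they are not part of the argument above.

```python
import math

def f(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 + 3.0 * y

def grad_f(x, y):
    # Partial derivatives of f, worked out by hand.
    return (2.0 * x, 3.0)

def directional_derivative(func, x, y, v, h=1e-6):
    # Central finite difference approximating d/dt func(x0 + t v) at t = 0.
    forward = func(x + h * v[0], y + h * v[1])
    backward = func(x - h * v[0], y - h * v[1])
    return (forward - backward) / (2.0 * h)

def rotate(u, theta):
    # Rotate the 2D vector u counterclockwise by the angle theta.
    c, s = math.cos(theta), math.sin(theta)
    return (c * u[0] - s * u[1], s * u[0] + c * u[1])

x0, y0 = 1.0, 2.0
gx, gy = grad_f(x0, y0)
norm = math.hypot(gx, gy)
unit_grad = (gx / norm, gy / norm)

results = []
for theta in (0.0, math.pi / 4, math.pi / 2):
    v = rotate(unit_grad, theta)  # unit vector at angle theta to the gradient
    numeric = directional_derivative(f, x0, y0, v)
    predicted = norm * math.cos(theta)
    results.append((theta, numeric, predicted))
    print(f"theta={theta:.4f}  numeric={numeric:.6f}  predicted={predicted:.6f}")
```

The two columns should agree up to rounding, with the largest value at $\theta = 0$ and a value close to zero at $\theta = \pi/2$.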
Now, suppose the variables are subject to a smooth constraint of the form $g(x) = c$, with $\nabla g \neq 0$ on the constraint set, and we want to find local extrema of $f$. A necessary condition is that $\nabla f$ be orthogonal to the constraint set, namely, that $\nabla f$ be parallel to $\nabla g$, or that $\nabla f = \lambda \nabla g$ for some scalar $\lambda$. Indeed, if this is not the case, then the projection of $\nabla f$ onto the tangent space of the constraint set is non-zero and gives a direction along the constraint in which $f$ increases.
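The projection argument can be sketched in a few lines of code. The example below is an assumption made for illustration only: it takes $f(x, y) = xy$ on the line $x + y = 2$, splits $\nabla f$ into a multiple of $\nabla g$ plus a tangential remainder, and checks that the remainder vanishes at the constrained extremum $(1, 1)$ but not at another point of the line. The helper name `split_gradient` is hypothetical.

```python
def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def split_gradient(grad_f, grad_g):
    # Write grad_f = lam * grad_g + tangential, with tangential orthogonal
    # to grad_g. Assumes grad_g is non-zero.
    lam = dot(grad_f, grad_g) / dot(grad_g, grad_g)
    tangential = (grad_f[0] - lam * grad_g[0], grad_f[1] - lam * grad_g[1])
    return lam, tangential

# Hypothetical example: f(x, y) = x * y on the line g(x, y) = x + y = 2.
def f(x, y):
    return x * y

def grad_f(x, y):
    return (y, x)

def grad_g(x, y):
    return (1.0, 1.0)

# At (1, 1) the tangential part vanishes, so grad f = lam * grad g.
lam_crit, tan_crit = split_gradient(grad_f(1.0, 1.0), grad_g(1.0, 1.0))
print("at (1, 1):     lambda =", lam_crit, " tangential part =", tan_crit)

# At (0.5, 1.5) the tangential part is non-zero: not a constrained extremum.
lam_other, tan_other = split_gradient(grad_f(0.5, 1.5), grad_g(0.5, 1.5))
print("at (0.5, 1.5): lambda =", lam_other, " tangential part =", tan_other)

# A small step along the tangential part stays on the line and raises f.
t = 1e-3
x1, y1 = 0.5 + t * tan_other[0], 1.5 + t * tan_other[1]
print("f before step:", f(0.5, 1.5), " f after step:", f(x1, y1))
```

Because the constraint here is a straight line, the step stays exactly on the constraint set; for a curved constraint the same step would only stay on it to first order.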
The diagram shows a linear function $f(x, y) = ax + by$ subject to a constraint $g(x, y) = x^{2} + y^{2} = c$. Here $\nabla f = (a, b)$ is constant, $\nabla g = (2x, 2y)$, and the constrained extrema of $f$ occur at the points where $(a, b)$ is perpendicular to the circle.
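As a concrete check of this example, the sketch below solves $(a, b) = \lambda (2x, 2y)$ together with $x^{2} + y^{2} = c$ in closed form and compares the result with a brute-force scan of the circle. The values $a = 3$, $b = 4$, $c = 25$ are assumptions picked to make the numbers easy to read.

```python
import math

# Hypothetical values for the example f(x, y) = a x + b y on x^2 + y^2 = c.
a, b, c = 3.0, 4.0, 25.0
r = math.sqrt(c)          # radius of the circle
norm = math.hypot(a, b)   # length of grad f = (a, b)

def f(x, y):
    return a * x + b * y

# Solving (a, b) = lam * (2x, 2y) with x^2 + y^2 = c gives two points,
# where the circle's radius is parallel to (a, b).
x_max, y_max = r * a / norm, r * b / norm
x_min, y_min = -x_max, -y_max
lam_max = norm / (2.0 * r)
lam_min = -lam_max

print("constrained max at", (x_max, y_max), "f =", f(x_max, y_max), "lambda =", lam_max)
print("constrained min at", (x_min, y_min), "f =", f(x_min, y_min), "lambda =", lam_min)

# Brute-force comparison: sample f at evenly spaced points of the circle.
n = 3600
samples = [
    f(r * math.cos(2.0 * math.pi * k / n), r * math.sin(2.0 * math.pi * k / n))
    for k in range(n)
]
print("largest sampled value: ", max(samples))
print("smallest sampled value:", min(samples))
```

The sampled maximum and minimum should sit just inside the closed-form values $\pm\sqrt{c}\,\|(a, b)\|$, which is what the tangency picture predicts.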