Proof that gradient is orthogonal to level set

If $\nabla f({\bf x}_0) \not= {\bf 0}$, then the Jacobian of $f$ (i.e. $\nabla f$) has maximal rank at ${\bf x}_0$. This means the implicit function theorem can be applied so that $\{ {\bf x} \in \mathbb{R}^{n} \,|\, f({\bf x})={\bf c} \}$ is a submanifold of $\mathbb{R}^n$. This means that about each point in the level set there is a diffeomorphism between a neighborhood of that point and an open set in $\mathbb{R}^{n-1}$.

At this point, we know the level set has a well defined tangent space. There are $n-1$ curves whose tangent vectors are linearly independent. Then we can apply the standard argument to each of these curves. Using the chain rule, we have $f({\bf r}(t))={\bf c}$ $\Rightarrow$ $\nabla f({\bf r}(t)) {\bf \cdot} {\bf r}'(t) = 0$. So the gradient is orthogonal to each tangent and thus is orthogonal to the level set.

So you are correct. The implicit function theorem is being used to guarantee that the curves we need actually exist.

Edit: A few more details.

Take a point on the level surface, say ${\bf x}_0 = (x_1,\dots,x_{n-1},y_0)=({\bf z}_0,y_0)$. Suppose that $\nabla f({\bf x}_0) \not=0$. For convenience, suppose that the last component of the gradient is non-zero.

Then there exists a region $D$ in $\mathbb{R}^{n-1}$ of points "close to" ${\bf z}_0$ such that $g(t_1,\dots,t_{n-1})$ is a function from $D$ to $\mathbb{R}$ and $f(t_1,\dots,t_{n-1},g(t_1,\dots,t_{n-1}))={\bf c}$ for all $(t_1,\dots,t_{n-1})$ in $D$ [This is the implicit function theorem in action. It allowed us to "solve" for the last variable in terms of the others.] Now we can define ${\bf r}_i(t)=(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_{n-1},g(x_1,\dots,x_{i-1},t,x_{i+1},\dots,x_{n-1}))$. We have ${\bf r}_i(x_i)={\bf x}_0$ and $f({\bf r}_i(t))={\bf c}$. This gives us $n-1$ curves on our level set.