Property of Dirac delta function in $\mathbb{R}^n$
Try replacing $\delta(x)$ with $\varphi_\epsilon(x)=\varphi(x/\epsilon)/\epsilon$, where $\varphi$ is a nonnegative function of compact support whose integral is $1$. For such $\varphi$, $\varphi_\epsilon\to\delta$ in the sense of distributions as $\epsilon\to 0$. Near a point $\pmb{r}\in S$, where $g(\pmb{r})=0$, we have $g(\pmb{x})=(\pmb{x}-\pmb{r})\cdot \nabla g(\pmb{r})+o(|\pmb{x}-\pmb{r}|)$.
On $S$, $\nabla g=\pmb{n}|\nabla g|$, where $\pmb{n}$ is the unit normal to $S$ pointing in the direction of $\nabla g$ (assuming $\nabla g\neq 0$ on $S$). Setting $\epsilon'=\epsilon/|\nabla g(\pmb{r})|$, we get near $\pmb{r}\in S$ $$ \begin{align} \varphi_\epsilon(g(\pmb{x}))&=\varphi((\pmb{x}-\pmb{r})\cdot \nabla g(\pmb{r})/\epsilon)/\epsilon+o(|\pmb{x}-\pmb{r}|)\\ &=\varphi((\pmb{x}-\pmb{r})\cdot \pmb{n}/\epsilon')/\epsilon'/|\nabla g(\pmb{r})|+o(|\pmb{x}-\pmb{r}|)\\ &=\varphi_{\epsilon'}((\pmb{x}-\pmb{r})\cdot \pmb{n})/|\nabla g(\pmb{r})|+o(|\pmb{x}-\pmb{r}|) \end{align} $$ where $\varphi_{\epsilon'}((\pmb{x}-\pmb{r})\cdot \pmb{n})$ is an approximation of surface measure on $S$ near $\pmb{r}$, since $(\pmb{x}-\pmb{r})\cdot \pmb{n}$ is the signed distance from $\pmb{x}$ to the tangent plane at $\pmb{r}$.
Thus, $\delta(g(\pmb{r}))\;d\pmb{r}=\;\displaystyle{\frac{d\sigma}{|\nabla g(\pmb{r})|}}$ where $d\sigma$ is surface measure on $S$.
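As a rough numerical sanity check of this identity, here is a minimal sketch in the plane. Everything in it is an illustrative assumption rather than part of the argument above: the hat function standing in for $\varphi$, the circle $g(x,y)=x^2+y^2-R^2$, the test function $f(x,y)=1+x^2$, and the grid parameters. On the circle $|\nabla g|=2R$, so the right-hand side is $\frac{1}{2R}\int_{|\pmb{x}|=R} f\,d\sigma$.

```python
import numpy as np

def bump(t):
    """Nonnegative hat function supported on [-1, 1] with integral 1."""
    return np.maximum(1.0 - np.abs(t), 0.0)

def phi_eps(t, eps):
    """Rescaled bump phi(t / eps) / eps, an approximation of delta."""
    return bump(t / eps) / eps

def delta_integral(f, g, eps, half_width=2.0, n=1201):
    """Riemann sum of f * phi_eps(g) over the square [-half_width, half_width]^2."""
    xs = np.linspace(-half_width, half_width, n)
    h = xs[1] - xs[0]
    X, Y = np.meshgrid(xs, xs)
    return np.sum(f(X, Y) * phi_eps(g(X, Y), eps)) * h * h

R = 1.5  # radius of the circle S = {g = 0} (illustrative choice)

def g(x, y):
    return x**2 + y**2 - R**2

def f(x, y):
    return 1.0 + x**2

# Left-hand side: integral of f * delta(g) with delta replaced by phi_eps.
approx = delta_integral(f, g, eps=0.1)

# Right-hand side: (1 / |grad g|) * integral of f over the circle, with
# |grad g| = 2R and the circle integral of 1 + x^2 equal to 2*pi*R + pi*R^3.
exact = (2 * np.pi * R + np.pi * R**3) / (2 * R)

print(approx, exact)
```

The two printed numbers should agree up to discretization error; shrinking `eps` without also refining the grid will make the Riemann sum less reliable, because fewer grid points fall inside the thin band where `phi_eps(g)` is nonzero.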
By Taylor's theorem, $g(\mathbf{x}) = g(\mathbf{r}) + \mathrm{grad}\, g(\mathbf{r}) \cdot (\mathbf{x}-\mathbf{r}) + o(\vert \mathbf{x}-\mathbf{r} \vert)$ in the vicinity of a point $\mathbf{r}$ on the surface, where $g(\mathbf{r})=0$. Use the linear term as a new coordinate: change basis, taking $\mathbf{n}_1 = \frac{\mathrm{grad}\, g(\mathbf{r})}{\vert \mathrm{grad}\, g(\mathbf{r}) \vert}$ as the first vector and choosing the remaining $\mathbf{n}_i$, $i=2, \ldots, n$, by the Gram-Schmidt orthogonalization procedure. Let $t_i$ be the coordinates in this system, $\mathbf{x} - \mathbf{r} = \sum_i t_i \mathbf{n}_i$, so that $g(\mathbf{x}) = \vert \mathrm{grad}\, g(\mathbf{r}) \vert\, t_1 + o(\vert \mathbf{x}-\mathbf{r} \vert)$. Since the basis is orthonormal, $\vert J \vert = 1$ and $dV_x = dx_1 \wedge d x_2 \wedge \ldots \wedge d x_n = \vert J \vert\, dt_1 \wedge d t_2 \wedge \ldots \wedge d t_n = dV_t$.
$$ \int f(\mathbf{x})\, \delta( g(\mathbf{x}))\, dV_x \approx \int f(\mathbf{x})\, \delta( \vert \mathrm{grad}\, g(\mathbf{r}) \vert\, t_1 )\, dV_t = \int f(\mathbf{x})\, \frac{1}{\vert \mathrm{grad}\, g(\mathbf{r}) \vert }\,\delta( t_1 )\, dV_t $$
Integrating over $t_1$ sets $t_1 = 0$ and leaves $dt_2 \wedge \ldots \wedge dt_n$, which is the surface element $d \sigma$.
This is a little hand-wavy, but gives you an idea.
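As a concrete check of the formula (an illustrative example, not part of the argument above), take $g(\mathbf{x}) = \vert \mathbf{x} \vert^2 - R^2$ in $\mathbb{R}^3$, so that the surface is the sphere of radius $R$ and $\vert \mathrm{grad}\, g \vert = 2 \vert \mathbf{x} \vert = 2R$ on it. With $f \equiv 1$ the formula gives

$$ \int \delta( \vert \mathbf{x} \vert^2 - R^2 )\, dV_x = \frac{1}{2R} \int_{\vert \mathbf{x} \vert = R} d\sigma = \frac{4 \pi R^2}{2R} = 2 \pi R. $$

Computing the same integral directly in spherical coordinates, using $\delta(\rho^2 - R^2) = \delta(\rho - R)/(2R)$ for $\rho > 0$, gives

$$ \int_0^\infty 4 \pi \rho^2\, \delta( \rho^2 - R^2 )\, d\rho = \frac{4 \pi R^2}{2R} = 2 \pi R, $$

in agreement.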
What you are quoting is a general statement about pull-backs of distributions. Since I am not entirely sure of your background, I won't try to give a detailed explanation here. Rather, I will refer you to Chapter 7 of Friedlander and Joshi's Introduction to the Theory of Distributions.