Precisely, what is a primitive recursive definition?
One way to define a "primitive recursive definition" precisely is to make use of some higher-type objects, which we will call $C$ and $R$. There is also a unary successor function $S(x)$, a unary zero function $Z(x)$, and for each $1 \leq i \leq j$ there is a projection function $$ \pi^j_i(x_1,\ldots,x_j) = x_i $$
We will define functions using the symbols "$C$", "$R$", "$S$", "$Z$", and "$\pi^j_i$" for all $i \leq j$, and two more symbols "(" and ")". This is an infinite alphabet of symbols. The definitions of primitive recursive functions will be a particular set of finite words on this alphabet.
For convenience I use the same characters here for the function and for the symbol that denotes the function: the symbol "$S$" represents the unary successor function $S(x)$. If I were being fastidious I could distinguish them. But I do put all strings of symbols into quotes to make sure there is no confusion: everything in quotes is a string of symbols, not a function.
Definition
Here is the inductive definition of "$\sigma$ is a definition of a primitive recursive function of $n$ variables", simultaneously for all $n$:
"S" is a definition of a primitive recursive function of one variable
"Z" is a definition of a primitive recursive function of one variable
For all $i \leq j$, "$\pi^j_i$" is a definition of a primitive recursive function of $j$ variables
If $\sigma_1, \ldots, \sigma_k$ are definitions of functions of the same number $m$ of variables and $\tau$ is a definition of a function of $k$ variables, then "$C(\tau)(\sigma_1)(\sigma_2)\ldots(\sigma_k)$" is a definition of a function of $m$ variables. The intended meaning is that this is the function $$ \tau(\sigma_1(x_1,\ldots,x_m),\sigma_2(x_1,\ldots,x_m),\ldots,\sigma_k(x_1,\ldots,x_m)) $$
If $\sigma$ is a definition of a function of $k$ variables and $\tau$ is a definition of a function of $k+2$ variables then "$R(\tau)(\sigma)$" is a definition of a function of $k+1$ variables. The intended meaning is that this is the function $f$ given by $$ \begin{aligned} f(x_1,\ldots,x_k,0) &= \sigma(x_1,\ldots, x_k)\\ f(x_1, \ldots, x_k, y+1) &= \tau(f(x_1,\ldots,x_k,y),y,x_1,\ldots,x_k) \end{aligned} $$
The set of definitions of primitive recursive functions is the smallest set $\mathcal{D}$ of strings satisfying this inductive definition.
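The arity bookkeeping in the clauses above can be sketched as a short recursive check. This is only an illustration under assumptions of my own: definitions are encoded as nested Python tuples rather than as strings, and the tags `"P"`, `"C"`, `"R"` and the function name `arity` are hypothetical names, not part of the definition.

```python
# Hypothetical encoding (an assumption, not the string syntax of the text):
#   ("S",)  ("Z",)  ("P", j, i)  ("C", tau, [sigma_1, ..., sigma_k])  ("R", tau, sigma)

def arity(term):
    """Return n if `term` is a definition of a function of n variables, else None."""
    tag = term[0]
    if tag in ("S", "Z"):
        return 1
    if tag == "P":
        _, j, i = term
        return j if 1 <= i <= j else None
    if tag == "C":
        _, tau, sigmas = term
        k = arity(tau)
        ms = {arity(s) for s in sigmas}
        # tau must take exactly k arguments, and all sigma_i must share one arity m
        if k is None or k != len(sigmas) or len(ms) != 1 or None in ms:
            return None
        return ms.pop()
    if tag == "R":
        _, tau, sigma = term
        k, k2 = arity(sigma), arity(tau)
        # sigma takes k arguments and tau must take k + 2
        if k is None or k2 is None or k2 != k + 2:
            return None
        return k + 1
    return None
```

A term is accepted exactly when `arity` returns a number, which mirrors the "simultaneously for all $n$" character of the inductive definition.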
Example
Consider the function $g(x,y) = x + y$. This is defined by primitive recursion as $$ \begin{aligned} g(x,0) &= x\\ g(x,y+1) &= S(g(x,y)) = S(\pi^3_1(g(x,y),y,x)) \end{aligned} $$
The base function is $\sigma = \pi^1_1$. The recursion function is the composition of $\pi^3_1$ into $S$: $\tau(z,y,x) = S(\pi^3_1(z,y,x))$. So one primitive recursive definition for this function, in the conventions above, is the string $$ \text{"}R(C(S)(\pi^3_1))(\pi^1_1)\text{"} $$
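To see the intended semantics at work on this example, here is a minimal evaluator sketch. It assumes a nested-tuple encoding of definitions that I am introducing for illustration (`("P", j, i)` for $\pi^j_i$, `("C", tau, [sigmas])`, `("R", tau, sigma)`); the names `evaluate` and `ADD` are likewise hypothetical, and no arity checking is done.

```python
def evaluate(term, args):
    """Evaluate a tuple-encoded definition on a list of natural numbers."""
    tag = term[0]
    if tag == "S":
        (x,) = args
        return x + 1
    if tag == "Z":
        return 0
    if tag == "P":
        _, j, i = term
        return args[i - 1]               # projections are 1-indexed
    if tag == "C":
        _, tau, sigmas = term
        return evaluate(tau, [evaluate(s, args) for s in sigmas])
    if tag == "R":
        _, tau, sigma = term
        *xs, y = args                    # recursion runs on the last argument
        acc = evaluate(sigma, xs)        # f(xs, 0) = sigma(xs)
        for t in range(y):               # f(xs, t+1) = tau(f(xs, t), t, xs)
            acc = evaluate(tau, [acc, t, *xs])
        return acc
    raise ValueError(f"unknown tag {tag!r}")

# The addition example: R(C(S)(pi^3_1))(pi^1_1)
ADD = ("R", ("C", ("S",), [("P", 3, 1)]), ("P", 1, 1))
print(evaluate(ADD, [3, 4]))  # 7
```

The recursion case is unrolled into a loop whose length is fixed by the last argument, so evaluation always terminates.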
Semantics
You can prove by induction that each string in $\mathcal{D}$ does define a primitive recursive function using the intended semantics described above. Conversely, every primitive recursive function is defined by at least one string in $\mathcal{D}$ (in fact, the same function will be defined by infinitely many strings in $\mathcal{D}$).
Also, the set $\mathcal{D}$ is decidable, and each string in $\mathcal{D}$ can be decoded in only one way as a definition of a primitive recursive function.
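The unique-decoding claim can be illustrated with a small recursive-descent parser. This is a sketch under assumptions: I write the symbol "$\pi^j_i$" in ASCII as `Pj_i` (for example `P3_1`), and the function names are hypothetical. It only checks the bracket structure; membership in $\mathcal{D}$ additionally requires the arity conditions from the definition.

```python
import re

# One token per alphabet symbol; "P3_1" is an ASCII stand-in for pi^3_1 (an assumption).
TOKEN = re.compile(r"P(\d+)_(\d+)|[SZCR()]")

def tokenize(s):
    pos, out = 0, []
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if m is None:
            raise ValueError(f"bad symbol at position {pos}")
        out.append(m.group(0))
        pos = m.end()
    return out

def parse_term(toks, i):
    """Parse one term starting at index i; return (tree, next index)."""
    if i >= len(toks):
        raise ValueError("unexpected end of input")
    t = toks[i]
    if t in ("S", "Z"):
        return (t,), i + 1
    if t.startswith("P"):
        j, k = map(int, t[1:].split("_"))
        return ("P", j, k), i + 1
    if t in ("C", "R"):
        parts, i = [], i + 1
        while i < len(toks) and toks[i] == "(":   # each argument is "(term)"
            sub, i = parse_term(toks, i + 1)
            if i >= len(toks) or toks[i] != ")":
                raise ValueError("expected ')'")
            parts.append(sub)
            i += 1
        if t == "R":
            if len(parts) != 2:
                raise ValueError("R takes exactly two arguments")
            return ("R", parts[0], parts[1]), i
        if len(parts) < 2:
            raise ValueError("C takes tau and at least one sigma")
        return ("C", parts[0], parts[1:]), i
    raise ValueError(f"unexpected symbol {t!r}")

def parse(s):
    toks = tokenize(s)
    tree, end = parse_term(toks, 0)
    if end != len(toks):
        raise ValueError("trailing symbols")
    return tree

print(parse("R(C(S)(P3_1))(P1_1)"))
```

At every step the next symbol determines which clause applies, so the parser never backtracks; that is the sense in which each string decodes in only one way.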
Comments
The idea of viewing $C$ and $R$ as term building operations is common in $\lambda$ calculus and type theory. In that context, $R$ is often called a "recursor". An alternative way of axiomatizing PRA is to avoid including a different function symbol for every primitive recursive function. Instead we include the basic function terms $S$, $Z$, and $\pi^j_i$, and the term building operations $C$ and $R$, along with the necessary axioms for all of these.
We can also think of these definitions as a specification of a programming language for primitive recursive functions. The definitions give a way to define each primitive recursive function, and it is completely feasible (I have done it) to write a compiler that will take a suitable definition as input and evaluate the function on given inputs. This gives a toy example of a functional programming language. Interestingly, there is also a procedural language, called LOOP, that is able to compute exactly the primitive recursive functions.
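For comparison, here is a rough sketch of a LOOP-style interpreter. It assumes one common presentation of LOOP with the instructions `x := 0`, `x := x + 1`, `x := y`, and `LOOP x DO ... END`; the tuple encoding of statements and the names `run` and `ADD` are my own illustrative choices.

```python
def run(prog, env):
    """Execute a LOOP program (a list of statements) on a dict of registers."""
    for stmt in prog:
        op = stmt[0]
        if op == "zero":                 # x := 0
            env[stmt[1]] = 0
        elif op == "inc":                # x := x + 1
            env[stmt[1]] = env.get(stmt[1], 0) + 1
        elif op == "copy":               # x := y
            env[stmt[1]] = env.get(stmt[2], 0)
        elif op == "loop":               # LOOP x DO body END
            # The iteration count is read once on entry; the body cannot change it.
            for _ in range(env.get(stmt[1], 0)):
                run(stmt[2], env)
        else:
            raise ValueError(f"unknown statement {op!r}")
    return env

# z := x; LOOP y DO z := z + 1 END   computes z = x + y
ADD = [("copy", "z", "x"), ("loop", "y", [("inc", "z")])]
print(run(ADD, {"x": 3, "y": 4})["z"])  # 7
```

Because the loop bound is fixed before the body runs, every program halts, which matches the bounded recursion carried out by $R$.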
There's a nice treatment in an old book that should be in any decent library (also very cheaply available as a Dover reprint), Joel W. Robbin’s Mathematical Logic: A First Course (W. A. Benjamin, 1969, Dover reprint 2006: pp. 212).
Robbin defines the primitive recursive functions in the usual way ($f$ is primitive recursive if there is a finite sequence of functions $f_0, f_1, f_2, \ldots, f_n = f$ where each $f_i$ is either an initial function or defined from earlier functions in the sequence by primitive recursion or composition); and then he defines a language which has a function expression for each p.r. function $f$. The key idea is to have a complex function expression built up to reflect a full definition of $f$ by primitive recursion and/or composition ultimately in terms of the initial functions. So we have primitive expressions for each initial function, and two ways of constructing new expressions from old, corresponding to definitions by primitive recursion and by composition.
The result is that, in your words, in the formal language we can construct a formal expression that fully encapsulates any informal primitive recursive definition. And any expression constructed in the described way will indeed express a proper definition by primitive recursion (and there will be no constructible "non-valid" definitions, just as you want).
Robbin then introduces a system of arithmetic, which he calls RA, that has axioms for the logic plus axioms governing the expressions for the initial functions, along with axioms for dealing with complex functional expressions in terms of their constituents. RA ends up stronger than PRA since Robbin allows stronger induction, but for the purposes of understanding how you get function expressions for each primitive recursive function, his treatment is exemplary, as far as I recall.