In Godel's first incompleteness theorem, what is the appropriate notion of interpretation function?
I've always interpreted this notion in the following way. $ \def\eq{\leftrightarrow} \def\t{\text} \def\pa{\t{PA}} \def\th{\t{Th}} \def\prf{\t{Proof}} \def\prov{\t{Prov}} \def\box{\square} \def\nn{\mathbb{N}} \def\str#1{{``\text{#1}\!"}} $
Formal system interpretation
Take any formal systems $S,T$.
We say that $S$ interprets $T$ via $ι$ iff $ι$ is a computable translation from sentences over $T$ to sentences over $S$ such that all the following hold for any sentences $φ,ψ$ over $T$:
(Ι1) $S \nvdash ι(\bot)$.
(Ι2) If $S \vdash ι(φ)$ and $S \vdash ι(φ \to ψ)$ then $S \vdash ι(ψ)$.
(Ι3) If $T \vdash φ$ then $S \vdash ι(φ)$.
This definition automatically implies that any formal system that interprets any other formal system is consistent.
It is sufficiently general to include all kinds of formal systems, unlike other definitions I've seen such as the one given in Rautenberg's A concise Introduction to Mathematical Logic, which only makes sense for classical first-order theories.
The 1st incompleteness theorem (non-constructive)
I shall sketch a proof that any formal system that interprets PA (in fact Robinson's Q suffices) and has decidable proof validity can neither prove nor disprove (the translation of) some sentence over PA.
Take any formal system $S$ with proof verifier $V$ (meaning that for every strings $x,y$, we have that $V(x,y)$ outputs $\str1$ if $x$ is a proof of $y$ over $S$ and outputs $\str0$ otherwise) that interprets PA via $ι$.
Then for any sentence $φ$ over PA, we have $S \vdash ι( φ \to ( \neg φ \to \bot ) )$ by (I3), and hence [(I4)] either $S \nvdash ι(φ)$ or $S \nvdash ι(\neg φ)$, because otherwise $S \vdash ι(\bot)$ by (I2) twice, contradicting (I1). [This is the only place where we use (I1) and (I2).]
Let $u$ be a $1$-parameter sentence over PA such that, for any strings $x,y,z$, if program $x$ on input $y$ produces output $z$, then $\pa \vdash u(c(x,y,z))$ and $\pa \vdash \neg u(c(x,y,w))$ for every string $w \ne z$. Here "$c(x,y,z)$" denotes the term coding for $(x,y,z)$. [The existence of $u$ is really the only difficult part when it comes to PA or Q; it becomes trivial if $S$ can interpret string manipulation natively, since we will not have to go through Godel coding.]
Let $G$ be the following program on input $(P,X)$:
For each string $s$ in length-lexicographic order:
If $V(s,ι(u(c(P,X,\str0))))$ then output $\str0$.
If $V(s,ι(\neg u(c(P,X,\str0))))$ then output $\str1$.
[The idea is that $G(P,X)$ searches for a proof or disproof of "$P$ halts on $X$ and outputs $\str0$" and outputs $\str0$ if it finds a proof and $\str1$ if it finds a disproof.]
If ( $S \vdash ι(φ)$ or $S \vdash ι(\neg φ)$ ) for every sentence $φ$ over PA:
Given any program $P$ and input $X$:
Either $S \vdash ι(u(c(P,X,\str0)))$ or $S \vdash ι(\neg u(c(P,X,\str0)))$, and hence $G$ halts on $(P,X)$.
If $P$ halts on $X$ and outputs $\str0$:
$\pa \vdash u(c(P,X,\str0))$, and hence $S \vdash ι(u(c(P,X,\str0)))$ by (I3).
Thus $S \nvdash ι(\neg u(c(P,X,\str0)))$ by (I4), and hence $G(P,X) = \str0$.
If $P$ halts on $X$ and does not output $\str0$:
$\pa \vdash \neg u(c(P,X,\str0))$, and hence $S \vdash ι(\neg u(c(P,X,\str0)))$ by (I3).
Thus $S \nvdash ι(u(c(P,X,\str0)))$ by (I4), and hence $G(P,X) = \str1$.
[But this property of $G$ is impossible!]
Let $C$ be the following program on input $P$:
If $G(P,P) = 0$ then output $\str1$ otherwise output $\str0$.
Then $C$ halts on $C$ because $G$ halts on $(P,P)$.
Thus $C(C) = \str0$ iff $G(C,C) = \str0$ [by property of $G$] iff $C(C) = \str1$ [by definition of $C$].
Contradiction.
Therefore ( $S \nvdash ι(φ)$ and $S \nvdash ι(\neg φ)$ ) for some sentence $φ$ over PA.
By the way, the core idea of this proof came from this blog post, but I changed the construction to make it cleaner and self-contained (using only basic programming knowledge). Interestingly, a commenter on that blog post claims to prove that the problem that the $G$ in the above proof solves is strictly weaker than the halting problem, which 'corresponds' nicely to the fact that using the unsolvability of the halting problem only yields Godel's version of the incompleteness theorem, which requires $Σ_1$-soundness of $S$.
The 1st incompleteness theorem (constructive)
This was technically not asked for in the question, but I am including it because it is related and also because it gives an explicit sentence that witnesses the first incompleteness theorem, unlike the above elegant but non-constructive proof. The proof simply translates Rosser's trick appropriately, and as before one can see that Q suffices in place of PA.
Take any formal system $S$ as before, and let $V,ι,u,c$ be as previously defined, and let $c'$ be the inverse of $c$.
Let $\prf_S$ be the $2$-parameter sentence $( m,n \mapsto u(c(V,(c'(m),ι(c'(n))),\str1)) )$ over PA.
Then, for any string $x$ and sentence $φ$ over PA, we have that $\pa \vdash \prf_S(c(x),c(φ))$ if $x$ is a proof of $ι(φ)$ over $S$ and $\pa \vdash \neg \prf_S(c(x),c(φ))$ otherwise.
For each sentence $φ$ over PA, let $\prov_S φ = \exists m\ ( \prf_S(m,c(φ)) \land \forall n<m\ ( \neg \prf_S(n,c(\neg φ)) )$. [Intuitively, "$\prov_S φ$" is intended to assert that there is a proof of $ι(φ)$ over $S$ and no smaller proof of $ι(\neg φ)$ over $S$.]
By the fixed point theorem for PA, let $φ$ be a sentence over PA such that $\pa \vdash φ \eq \neg \prov_S φ$.
If $S \vdash ι(φ)$:
$\pa \vdash \prov_S φ \equiv \neg φ$, and hence $S \vdash ι(\neg φ)$ by (I3).
Also $S \nvdash ι(\neg φ)$ by (I4), and hence a contradiction.
If $S \vdash ι(\neg φ)$:
Let $p$ be the proof of $ι(\neg φ)$ over $S$.
Then $\pa \vdash \prf_S(c(p),c(\neg φ))$, and hence $\pa \vdash \forall m > c(p)\ ( \exists n<m\ ( \prf(n,c(\neg φ)) ) )$.
Also $S \nvdash ι(φ)$ by (I4), and hence $\pa \vdash \forall m \le c(p)\ ( \neg \prf_S(m,c(φ)) )$.
Thus $\pa \vdash \neg \prov_S φ \equiv φ$, and hence $S \vdash ι(φ)$ by (I3), which gives a contradiction.
Therefore $S \nvdash ι(φ)$ and $S \nvdash ι(\neg φ)$.
Provability logic and the 2nd incompleteness theorem
We can get more if we further require (I2) and (I3) to be witnessed by computable functions. Precisely:
Take any formal systems $S,T$.
We say that $S$ uniformly interprets $T$ via $ι,f,g$ iff $S$ interprets $T$ via $ι$ and $f,g$ are partial computable functions such that, for any sentences $φ,ψ$ over PA, the following hold:
(I2) For any proofs $x,y$ of $ι(φ)$ and $ι(φ \to ψ)$ respectively over $S$, we have that $f(x,y)$ is a proof of $ι(ψ)$ over $S$.
(I3) For any proof $x$ of $φ$ over PA, we have that $g(x)$ is a proof of $ι(φ)$ over $S$.
Then any formal system $S$ that has decidable proof validity and uniformly interprets PA cannot prove 'its own consistency'. Note that the proof I shall give does not apply to formal systems that merely uniformly interpret Q.
Take any formal system $S$ that has decidable proof validity and uniformly interprets PA via $ι,f,g$, and let $\prf$ be as previously defined.
For each sentence $φ$ over PA, let $\box_S φ = \exists m\ ( \prf_S(m,c(φ)) )$, and let $\t{Con}(S) = \neg \box_S \bot$.
Then it is not hard to check that $S$ satisfies the provability conditions and the fixed point theorem, in the sense that the following hold:
(D1) If $S \vdash ι(φ)$ then $S \vdash ι( \box_S φ )$, for any sentence $φ$ over PA.
(D2) $S \vdash ι( \box_S φ \land \box_S( φ \to ψ ) \to \box_S ψ )$, for any sentences $φ,ψ$ over PA.
(D3) $S \vdash ι( \box_S φ \to \box_S \box_S φ )$, for any sentence $φ$ over PA.
(F) Given any $1$-propositional-parameter sentence $P$ over PA such that every occurrence of the parameter in $P$ is bound by some $\box_S$, there is a sentence $Q$ over PA so that $S \vdash ι( Q \eq P(Q) )$.
Basically this is because PA can capture $f,g$, and hence manipulate proofs over $S$ of arithmetical sentences (of the form $ι(φ)$ for some sentence $φ$ over PA). $S$ interprets PA and hence can do the same.
Thus by Lob's theorem as proven in provability logic we get Godel's second incompleteness theorem for $S$ in both external and internal form:
(GI*) $S \nvdash ι( \t{Con}(S) )$.
(GI) $S \vdash ι( \t{Con}(S) \to \neg \box_S \t{Con}(S) )$.
Note that although (I1) only requires that $S$ does not prove the (translation of the) arithmetical sentence "$\bot$", we have shown that in fact $S$ also does not prove the arithmetical sentence $\t{Con}(S)$, which is just a $Π_1$-sentence (because $u$ needs only bounded quantifiers and hence also $\prf$) such that every instantiation of that universal sentence can be proven by $S$! Therefore $S$ is $ω$-incomplete in the same essential way that PA is.
The relevant notion of "capable of expressing elementary arithmetic" could be formalized with something like:
There's a primitive recursive family of numeral predicates $\psi_0(x), \psi_1(x), \ldots, \psi_n(x), \ldots$ such that $ T\vdash\exists! x.\psi_i(x)$ for each $i$ and $T\vdash \neg\psi_i(x)\lor \neg\psi_j(x)$ for $i\ne j$.
For each primitive recursive function $f$ of $k$ variables, there's a formula $\phi(x_1,\ldots,x_k,y)$ such that $$ T\vdash \psi_{n_1}(x_1) \land \cdots \land \psi_{n_k}(x_k) \to \bigl(\phi(x_1,\ldots,x_k,y)\leftrightarrow \psi_{f(n_1,\ldots,n_k)}(y)\bigr) $$ for all $n_1,\ldots,n_k$.
possibly with some additional technical restrictions (e.g., the $\omega$-consistency requirement in Gödel's original work would correspond to some additional conditions on what can be proved about the $\psi_i$s).