Who first proved that we can prove that we prove things we prove?
The theorem $\Box P \Rightarrow \Box\Box P$ is due to Martin Löb and first appears in his 1955 paper "Solution of a Problem of Leon Henkin", J. Symb. Logic 20 115–118: it appears as condition (V) (page 116) in the paper in question, and whereas conditions (I)–(IV) are referred there to the earlier (1939) book by Hilbert and Bernays, Grundlagen der Mathematik, condition (V) (although easily deduced from the others) is new.
The reasoning "that a strong enough theory of arithmetic proves every true $\Sigma_1$ sentence" is exactly the one which Löb uses in his proof (if we grant that "$\exists x.(f(x)=0)$" for a recursive $f$ qualifies as "every $\Sigma_1$ sentence").
As evidence that Löb was the first to state this fact, I offer the following quote from G. Boolos in The Logic of Provability (1995), chapter 2: "Hilbert and Bernays had listed three somewhat ungainly conditions […]. The isolation of (the attractive) (i), (ii), and (iii) [essentially as on Wikipedia] is due to Löb." As well as the entry "Provability Logic" from the Stanford Encyclopedia of Philosophy: "In the same paper, Löb formulated three conditions on the provability predicate of Peano Arithmetic, that form a useful modification of the complicated conditions that Hilbert and Bernays introduced in 1939 for their proof of Gödel's second incompleteness theorem [again the same conditions as on Wikipedia]".
Both results are due to Feferman. They appear as Lemma 3.10 and Corollary 5.5 of Arithmetization of metamathematics in a general setting, Fundamenta Mathematicae 49:35-92, 1960.