Could groups be used instead of sets as a foundation of mathematics?
The answer is yes; in fact one has something considerably stronger than bi-interpretability, as shown by the corollary at the end. It follows by combining the comments by Martin Brandenburg and mine (plus a few additional details I found on MO). The key observation is the following:
Theorem: The category of co-group objects in the category of groups is equivalent to the category of sets.
(According to the nLab, this is due to Kan, from the paper "On monoids and their dual" Bol. Soc. Mat. Mexicana (2) 3 (1958), pp. 52-61, MR0111035)
Co-groups are easily defined in purely categorical terms (see Edit 2 below).
The equivalence of the theorem is given by free groups as follows: if $X$ is a set and $F_X$ is the free group on $X$, then Hom$(F_X,H)=H^X$ is a group, functorially in $H$, hence $F_X$ has a cogroup object structure. A function between sets $Y \rightarrow X$ induces a re-indexing function $H^X \rightarrow H^Y$ that is a group morphism, so functions between sets induce cogroup morphisms.
Explicitly, $\mu:F_X \rightarrow F_X * F_X$ is the map that sends each generator $e_x$ to $e_x^L * e_x^R$, and $i$ is the map that sends each generator to its inverse.
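The explicit structure maps can be checked mechanically on sample words. The following is a minimal sketch, assuming a representation of free-group elements as reduced tuples of (generator, exponent) pairs; the helper names (`reduce_word`, `hom`, `copair`, `i_map`) are hypothetical and chosen only for this illustration.

```python
def reduce_word(word):
    """Freely reduce a word: a sequence of (generator, exponent) pairs, exponent +1 or -1."""
    out = []
    for g, e in word:
        if out and out[-1] == (g, -e):
            out.pop()              # cancel an adjacent pair x x^-1 or x^-1 x
        else:
            out.append((g, e))
    return tuple(out)

def inverse(word):
    """Inverse of a word: reverse the letters and flip the exponents."""
    return tuple((g, -e) for g, e in reversed(word))

def gen(x):
    """The one-letter word e_x."""
    return ((x, 1),)

def hom(image_of):
    """Homomorphism out of a free group, determined by the image of each generator."""
    def f(word):
        out = []
        for g, e in word:
            img = image_of(g)
            out.extend(img if e == 1 else inverse(img))
        return reduce_word(out)
    return f

# Generators of F_X * F_X are pairs (x, "L") or (x, "R").
mu = hom(lambda x: gen((x, "L")) + gen((x, "R")))   # e_x -> e_x^L e_x^R
i_map = hom(lambda x: inverse(gen(x)))              # e_x -> e_x^{-1}
identity = hom(gen)
zero = hom(lambda x: ())                            # constant map through the trivial group

def copair(f, g):
    """The map (f, g): F_X * F_X -> F_X, equal to f on the left copy and g on the right."""
    return hom(lambda t: f(gen(t[0])) if t[1] == "L" else g(gen(t[0])))

# Sample word w = e_a e_b^{-1} e_a
w = gen("a") + inverse(gen("b")) + gen("a")

assert copair(identity, zero)(mu(w)) == w      # counit law, left
assert copair(zero, identity)(mu(w)) == w      # counit law, right
assert copair(identity, i_map)(mu(w)) == ()    # inverse law: result is the empty word
```

Note that `i_map` is the homomorphism inverting each generator, so on a longer word it is not the same as the group inverse of that word.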
An easy calculation shows that the generators are the only elements such that $\mu(y)=y^L*y^R$ and hence that any cogroup morphism comes from a function between sets. So the only co-group morphisms are the ones sending generators to generators.
And with a bit more work, as nicely explained in this other MO answer, one can check that any cogroup object is of this form.
Now, as all this is a theorem of $\sf{ETCS}$, it is a theorem of $\sf{ETCG}$ that all the axioms (and theorems) of $\sf{ETCS}$ are satisfied by the category of cogroup objects in any model of $\sf{ETCG}$, which gives you the desired bi-interpretability between $\sf{ETCS}$ and $\sf{ETCG}$. Adding supplementary axioms to $\sf{ETCS}$ (like R) does not change anything.
In fact, one has more than bi-interpretability: the two theories are equivalent in the sense that there is an equivalence between their models. But one has something even stronger:
Corollary: Given $T$ a model of $\sf{ETCS}$, then $Grp(T)$ is a model of $\sf{ETCG}$. Given $A$ a model of $\sf{ETCG}$, then $CoGrp(A)$ is a model of $\sf{ETCS}$. Moreover these two constructions are inverse to each other up to equivalence of categories.
Edit: this is an answer to a question of Matt F. in the comments, asking for an explicit example of how axioms and theorems of $\sf{ETCS}$ translate into $\sf{ETCG}$.
So in $\sf{ETCS}$ there is a theorem (maybe an axiom) that given a monomorphism $S \rightarrow T$ there exists an object $R$ such that $T \simeq S \coprod R$.
In $\sf{ETCG}$ this can be translated as: given $T$ a cogroup object and $S \rightarrow T$ a cogroup monomorphism* then there exists a co-group $R$ such that $T \simeq S * R$ as co-groups**.
*: It is also a theorem of $\sf{ETCG}$ that a map between cogroups is a monomorphism of cogroups if and only if the underlying map of objects is a monomorphism. Indeed, that is something you can prove for the category of groups in $\sf{ETCS}$, so it holds in $\sf{ETCG}$ by definition.
** : We can prove in $\sf{ETCG}$ (either directly because this actually holds in any category, or by proving it for groups in $\sf{ETCS}$) that the coproduct of two co-group objects has a canonical co-group structure which makes it the coproduct in the category of co-groups.
Edit 2: To clarify that the category of cogroups is defined purely in the categorical language:
The coproduct in the category of groups is the free product $G * G$, and it is definable by its usual universal property.
A cogroup is then an object (here a group) equipped with a map $\mu: G \rightarrow G * G$ which is co-associative, that is $(\mu * Id_G) \circ \mu = (Id_G * \mu) \circ \mu$, and counital (the co-unit has to be the unique map $G \rightarrow 1$), that is $(Id_G,0) \circ \mu = Id_G$ and $(0,Id_G) \circ \mu = Id_G$, where $(f,g)$ denotes the map $G * G \rightarrow G$ which is $f$ on the first component and $g$ on the other component, as well as an inverse map $i:G \rightarrow G$ such that $(Id_G ,i ) \circ \mu = 0 $. Morphisms of co-groups are the maps $f:G \rightarrow H$ that are compatible with all these structures, which mostly means that $(f * f) \circ \mu_G = \mu_H \circ f$.
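As a sanity check, here is how these axioms can be verified on a generator $e_x$ of the free group $F_X$, using the structure maps $\mu(e_x) = e_x^L e_x^R$ and $i(e_x) = e_x^{-1}$ described earlier, and reading every composite as "apply $\mu$ first". The labels $e_x^{(1)}, e_x^{(2)}, e_x^{(3)}$ for the three copies of the generator in $F_X * F_X * F_X$ are notation introduced only for this sketch.

$$\begin{array}{rcl} ((Id,0) \circ \mu)(e_x) & = & e_x \cdot 1 = e_x \\ ((0,Id) \circ \mu)(e_x) & = & 1 \cdot e_x = e_x \\ ((Id,i) \circ \mu)(e_x) & = & e_x \, e_x^{-1} = 1 \\ ((\mu * Id) \circ \mu)(e_x) & = & (\mu * Id)(e_x^L e_x^R) = e_x^{(1)} e_x^{(2)} e_x^{(3)} \\ ((Id * \mu) \circ \mu)(e_x) & = & (Id * \mu)(e_x^L e_x^R) = e_x^{(1)} e_x^{(2)} e_x^{(3)} \\ \end{array}$$

Since every map involved is a group homomorphism out of a free group, agreement on generators is enough to conclude that the identities hold on all of $F_X$.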
If you have doubts related to the "choice" of the object $G * G$ (which is only defined up to unique isomorphism), a way to resolve them is to define "a co-group object" as a triple of objects $G,G *G,G * G *G$ with appropriate maps between them satisfying a bunch of conditions (including the universal property), and morphisms of co-groups as triples of maps satisfying all the expected conditions. This gives an equivalent category.
There are a few bits of good news for an affirmative answer to this question.
Theorem 1) ZFC can be interpreted in Th(On), the first-order theory of ordinals. See Gaisi Takeuti, "Formalization of the Theory of Ordinals", JSL 1965. (https://projecteuclid.org/euclid.jsl/1183735178)
Theorem 2) There are abelian $p$-groups of every infinite ordinal length, where the length $\ell(G)$ of a group $G$ is the least ordinal $\sigma$ such that $p^\sigma G=0$. See Laszlo Fuchs, Infinite Abelian Groups, vol 2: p. 58 for the definition and p. 85 for the construction of these generalized Prüfer groups.
Putting these together, I had hoped to encode the ordinals by such groups, and thus interpret Th(On) in ETCG, from which an interpretation of ZFC in ETCG would follow.
The bad news is that Takeuti's theory Th(On) is a theory in a large language, which starts with $a=b$, $a<b$, $(a,b)$ (ordered pair), and then goes on to include $+$, $\times$ and all primitive recursive functions of ordinals. So to interpret this Th(On) in ETCG, we would at a minimum need to find formulas $\phi_\le, \phi_{\wedge}$ in ETCG such that:
$\phi_\le (a,b)$ holds exactly when $\ell(a)\le\ell(b)$
$\phi_\wedge (a,b,c)$ holds exactly when $\ell(a)=\ell(b)^{\ell(c)}$
Perhaps $\phi_\le$ would be as simple as saying that there is a mono from $a$ to $b$. But finding $\phi_\wedge$ seems difficult.
Even finding a way of characterizing the generalized Prüfer groups in the ETCG language seems difficult. Fortunately there is still one more bit of good news:
Claim 3) We can characterize the abelian groups in the language of ETCG.
$1$ is the unique terminal object in the category.
A morphism is constant if it factors through $1$.
$G$ is almost free iff for every $H$ other than $1$, there is a non-constant map from $G$ to $H$.
$\mathbb{Z}$ is the unique almost free group with monos into all other almost free groups.
$G$ has two elements iff there are exactly two maps from $\mathbb{Z}$ to $G$.
$G$ has eight elements iff there are exactly eight maps from $\mathbb{Z}$ to $G$.
$H$ is a subgroup of $G$ iff there is a mono from $H$ to $G$.
$G/H=K$ iff there is a mono and an epi
$$H \hookrightarrow G \twoheadrightarrow K$$
whose composition is constant, and such that whenever the square commutes in the diagram below, there is a map from $\mathbb{Z}$ to $H$ making the triangle commute also:
$$\begin{array}{ccccc} & & \mathbb{Z} & \rightarrow & 1 \\ & \swarrow & \downarrow & & \downarrow \\ H & \hookrightarrow & G & \twoheadrightarrow & K\\ \end{array}$$
(The second condition is saying that the kernel of the epi is included in the range of the mono.)
$H$ is a normal subgroup of $G$ iff $G/H=K$ for some $K$.
$G$ is cyclic iff it is $\mathbb{Z}/H$ for some $H$.
$Q$ is the unique 8-element group which is not cyclic, but which has a two-element subgroup $S$ whose mono into $Q$ factors through any subgroup of $Q$ other than $1$.
$G$ is abelian iff all of its subgroups are normal, and $Q$ is not a subgroup of $G$.
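The reason $Q$ has to be excluded in the last item can be seen by brute force: the quaternion group has every subgroup normal without being abelian. The sketch below is an illustration only, assuming $Q$ is encoded as the eight unit quaternions $\pm 1, \pm i, \pm j, \pm k$ stored as integer 4-tuples; the function names are hypothetical. It also checks the two properties used above to single out $Q$: it is not cyclic, and its two-element subgroup sits inside every subgroup other than $1$.

```python
from itertools import combinations

def qmul(p, q):
    """Hamilton product of quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(g):
    """Inverse of a unit quaternion is its conjugate."""
    a, b, c, d = g
    return (a, -b, -c, -d)

ONE = (1, 0, 0, 0)
# The eight elements +-1, +-i, +-j, +-k
Q8 = [tuple(s if n == m else 0 for n in range(4)) for m in range(4) for s in (1, -1)]

def is_subgroup(S):
    # A finite subset containing the identity and closed under products is a subgroup.
    return ONE in S and all(qmul(x, y) in S for x in S for y in S)

# Exhaustive search over all nonempty subsets of the 8 elements
subgroups = [frozenset(S) for r in range(1, 9)
             for S in combinations(Q8, r) if is_subgroup(frozenset(S))]

def is_normal(H):
    return all(qmul(qmul(g, h), qinv(g)) in H for g in Q8 for h in H)

def order(g):
    n, x = 1, g
    while x != ONE:
        x, n = qmul(x, g), n + 1
    return n

is_abelian = all(qmul(x, y) == qmul(y, x) for x in Q8 for y in Q8)
all_normal = all(is_normal(H) for H in subgroups)
is_cyclic = any(order(g) == 8 for g in Q8)
S = frozenset({ONE, (-1, 0, 0, 0)})   # the two-element subgroup {1, -1}
S_in_every_nontrivial = all(S <= H for H in subgroups if len(H) > 1)

print(is_abelian, all_normal, is_cyclic, S_in_every_nontrivial)
```

So "all subgroups are normal" alone does not force commutativity, which is why the extra clause about $Q$ appears in the characterization.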
I expect we can go further towards characterizing the generalized Prüfer groups by characterizing reduced non-separable infinite abelian 2-groups in ETCG. Anyone who finds that easy would be in a better place than I am to complete the difficult but maybe not impossible plan above.