Isometry without assuming injectivity or surjectivity

It is natural to assume $B_1$ and $B_2$ are both non-degenerate, meaning that if $B_1(x,y) = 0$ for all $y$ then $x = 0$, and if $B_1(x,y) = 0$ for all $x$ then $y = 0$, and likewise for $B_2$. This condition was suggested in a comment by wxu to countinghaus's answer, and it is a good one. For real vector spaces, a positive-definite bilinear form is non-degenerate (after all, if $B(x,y) = 0$ for all $y$, then taking $y = x$ gives $B(x,x) = 0$, so $x = 0$ by positive-definiteness). While positive-definite bilinear forms are non-degenerate, far from all non-degenerate bilinear forms on a real vector space are positive definite. In fact, physics gives us a great example: the bilinear form associated to $x_1^2 + x_2^2 + x_3^2 - x_4^2$ in (special) relativity is not positive definite, but it is non-degenerate. Moreover, non-degeneracy is a condition on a bilinear form that still makes sense when you're not working over the real or complex numbers (say, over fields of positive characteristic), where positive definiteness loses its meaning.
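To make non-degeneracy concrete in coordinates: if $B(x,y) = x^{\top}Gy$ for a Gram matrix $G$, then $B$ is non-degenerate exactly when $G$ is invertible. Here is a quick numerical illustration of this for the relativity example (my own sketch in Python/numpy, not part of the argument):

```python
import numpy as np

# Gram matrix of the Minkowski form x1^2 + x2^2 + x3^2 - x4^2.
G = np.diag([1.0, 1.0, 1.0, -1.0])

def B(x, y):
    """The bilinear form with Gram matrix G: B(x, y) = x^T G y."""
    return x @ G @ y

# Non-degenerate: the Gram matrix is invertible (nonzero determinant)...
print(np.linalg.det(G))                      # -1.0, so B is non-degenerate

# ...but not positive definite: B(v, v) < 0 for v = (0, 0, 0, 1).
v = np.array([0.0, 0.0, 0.0, 1.0])
print(B(v, v))                               # -1.0
```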

Let's assume $\sigma \colon V_1 \rightarrow V_2$ is a function between two vector spaces over a field $F$ and that $B_1(v,w) = B_2(\sigma(v),\sigma(w))$ for all $v$ and $w$ in $V_1$, where $B_i$ is a non-degenerate bilinear form on $V_i$. What can we say about $\sigma$?

1) It is injective. Indeed, if $\sigma(v) = \sigma(w)$, then for all $u$ in $V_1$ we have $$ B_1(v,u) = B_2(\sigma(v),\sigma(u)) = B_2(\sigma(w),\sigma(u)) = B_1(w,u),$$ so $B_1(v-w,u) = 0$. This holds for all $u$, so $v-w = 0$ by non-degeneracy of $B_1$. Thus $v = w$.

2) It preserves linear independence. Suppose $v_1,\dots,v_n$ are linearly independent in $V_1$. We will show $\sigma(v_1),\dots,\sigma(v_n)$ are linearly independent in $V_2$. Assume $\sum_{i=1}^n c_i\sigma(v_i) = 0$ in $V_2$, where $c_i \in F$. For all $u$ in $V_1$, $$ B_1\left(\sum_{i=1}^n c_iv_i,u\right) = \sum_{i=1}^n c_iB_1(v_i,u) = \sum_{i=1}^n c_iB_2(\sigma(v_i),\sigma(u)) = B_2\left(\sum_{i=1}^n c_i\sigma(v_i),\sigma(u)\right), $$ and the last term is $B_2(0,\sigma(u)) = 0$. Thus $B_1(\sum_{i=1}^n c_iv_i,u) = 0$ for all $u$ in $V_1$. By non-degeneracy of $B_1$, $\sum_{i=1}^n c_iv_i = 0$, so each $c_i$ is $0$ by linear independence of $v_1,\dots,v_n$.
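Here is a small numerical illustration of items 1 and 2 (my own sketch in numpy; the particular matrix $A$ and the choice of standard dot products are just for illustration):

```python
import numpy as np

# A concrete isometry that is injective but not surjective: sigma(v) = A v,
# where A has orthonormal columns, so (A v) . (A w) = v . w for the standard
# dot products on R^2 and R^3.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

rng = np.random.default_rng(0)
v, w = rng.standard_normal(2), rng.standard_normal(2)

# The isometry condition B_1(v, w) = B_2(sigma(v), sigma(w)):
print(np.isclose(v @ w, (A @ v) @ (A @ w)))                      # True

# Item 2: linearly independent vectors stay linearly independent.
V = rng.standard_normal((2, 2))          # rows: two independent vectors in R^2
print(np.linalg.matrix_rank(V), np.linalg.matrix_rank(V @ A.T))  # 2 2
```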

Notice the non-degeneracy of $B_2$ hasn't been used so far.

3) If $V_1$ and $V_2$ are finite-dimensional and $\dim(V_1) = \dim(V_2)$, then $\sigma$ is linear, and hence surjective, since an injective linear map between vector spaces of the same (finite) dimension is surjective. If it surprises you to see linearity appear as a consequence rather than a hypothesis, note that we haven't used linearity of $\sigma$ anywhere yet: it follows from the isometry condition once the bilinear forms are non-degenerate and the spaces have the same (finite) dimension. I will give two proofs that $\sigma$ is linear.

Proof #1. Pick a basis $e_1,\dots,e_n$ of $V_1$. Define $L \colon V_1 \rightarrow V_2$ to be the linear map where $L(e_i) = \sigma(e_i)$ for all $i$. That is, $$ L(c_1e_1+\cdots+c_ne_n) := c_1L(e_1) + \cdots + c_nL(e_n) $$ for all scalars $c_i$. Since $L$ is linear and $B_1$ and $B_2$ are bilinear, it follows from $B_2(L(e_i),L(e_j)) = B_1(e_i,e_j)$ for all $i$ and $j$ that $B_2(L(v),L(w)) = B_1(v,w)$ for all $v$ and $w$ in $V_1$. Then $L$ is injective by the same argument used in part 1 for $\sigma$. Since $L$ is 1-1 and linear between vector spaces of the same (finite) dimension, it is invertible and $B_1(L^{-1}(x),L^{-1}(y)) = B_2(x,y)$ for all $x$ and $y$ in $V_2$. Set $k = L^{-1} \circ \sigma \colon V_1 \rightarrow V_1$. Then $k(e_i) = e_i$ for all $i$ and $B_1(k(v),k(w)) = B_1(v,w)$ for all $v$ and $w$ in $V_1$.

For all $v$ in $V_1$, $B_1(k(v),e_i) = B_1(k(v),k(e_i)) = B_1(v,e_i)$, so the linearity of $B_1$ in its 2nd component implies $B_1(k(v),w) = B_1(v,w)$ for all $w$ in $V_1$, i.e., $B_1(k(v)-v,w) = 0$ for all $w$ in $V_1$. Then non-degeneracy of $B_1$ implies $k(v) = v$ for all $v$ in $V_1$. Thus $k$ is the identity, so $\sigma = L$. This proves $\sigma$ is an invertible linear map. (We didn't use the 2nd item here.)
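If you want to see the construction in Proof #1 numerically, here is a sketch (my own, in numpy, taking both forms to be the standard dot product on ${\mathbf R}^3$, where every isometry comes from an orthogonal matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# A "black-box" isometry of (R^3, standard dot product); any such map is
# secretly an orthogonal matrix, here produced by a QR decomposition.
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

def sigma(v):
    return Q @ v

# Proof #1's construction: L is the linear map with L(e_i) = sigma(e_i),
# i.e. the matrix whose columns are sigma of the standard basis vectors.
L = np.column_stack([sigma(e) for e in np.eye(3)])

# k = L^{-1} o sigma fixes each e_i and preserves the form, so it is the
# identity; equivalently, L agrees with sigma everywhere.
v = rng.standard_normal(3)
print(np.allclose(np.linalg.solve(L, sigma(v)), v))   # True: k(v) = v
print(np.allclose(L @ v, sigma(v)))                   # True: sigma = L
```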

Proof #2. Pick a basis $e_1,\dots,e_n$ of $V_1$. Then $\sigma(e_1),\dots,\sigma(e_n)$ is linearly independent in $V_2$ by the 2nd item above, so it is a basis of $V_2$. For any scalars $a_1,\dots,a_n$, set $v' = \sum_{i=1}^n a_i\sigma(e_i)$. (All elements of $V_2$ arise in this way.) We expect that $v' = \sigma(\sum_{i=1}^n a_ie_i)$. Write $v' - \sigma(\sum_{i=1}^n a_ie_i)$ as $\sum_{i=1}^n b_i\sigma(e_i)$ for some $b_i$ in $F$. For any $u$ in $V_1$, we will compute $B_2(\sum_{i=1}^n b_i\sigma(e_i),\sigma(u))$ in two ways.

First it is $$ \sum_{i=1}^n b_iB_2(\sigma(e_i),\sigma(u)) = \sum_{i=1}^n b_iB_1(e_i,u) = B_1\left(\sum_{i=1}^n b_ie_i,u\right). $$ Second, it is $$ B_2\left(\sum_{i=1}^n a_i\sigma(e_i)-\sigma\left(\sum_{i=1}^n a_ie_i\right),\sigma(u)\right) = \sum_{i=1}^na_iB_1(e_i,u) - B_1\left(\sum_{i=1}^n a_ie_i,u\right)=0. $$ Thus $B_1(\sum_{i=1}^n b_ie_i,u)=0$ for all $u$, so $\sum_{i=1}^n b_ie_i = 0$ in $V_1$ by non-degeneracy of $B_1$, and that implies each $b_i$ is 0. We have shown $$ \sum_{i=1}^n a_i\sigma(e_i) = \sigma\left(\sum_{i=1}^n a_ie_i\right) $$ for all $a_i$ in $F$. Thus $\sigma$ is linear.

The linearity of $\sigma$ is due to Andrew Vogt. See Theorem 2.4 in A. Vogt, "On the linearity of form isometries", SIAM Journal on Applied Mathematics 22 (1972), 553--560. The second proof above is Vogt's proof. (He also has a linearity proof in Lemma 1.5 if $B_2$ is non-degenerate but non-degeneracy is not assumed for $B_1$.)

To summarize, if $V_1$ and $V_2$ are finite-dimensional vector spaces of the same dimension over a field, equipped with non-degenerate bilinear forms $B_1$ and $B_2$, then any function $\sigma \colon V_1 \rightarrow V_2$ such that $B_1(v,w) = B_2(\sigma(v),\sigma(w))$ for all $v$ and $w$ in $V_1$ must be an isomorphism of vector spaces (linear, injective, and surjective). Non-degeneracy of $B_2$ on $V_2$ was never used in the argument, and in fact it now follows from non-degeneracy of $B_1$ on $V_1$: if $B_2(x,y) = 0$ for all $y$ in $V_2$, write $x = \sigma(v)$; then $B_1(v,w) = B_2(\sigma(v),\sigma(w)) = 0$ for all $w$ in $V_1$, so $v = 0$ and thus $x = 0$.

Some of the comments on this question have pointed out that if you don't assume the spaces have the same dimension, then there is no reason to expect surjectivity, e.g., an embedding of one Euclidean space into another of larger dimension. Vogt's linearity theorem breaks down as well. Consider $\sigma \colon {\mathbf R}^n \rightarrow {\mathbf R}^{n+1,1}$ (here ${\mathbf R}^{p,q}$ denotes $(p+q)$-dimensional space with a quadratic form of signature $(p,q)$) given by $$ \sigma(x_1,\dots,x_n) = (x_1,\dots,x_n,x_n^3,x_n^3). $$ This map $\sigma$ is not linear, but it preserves the bilinear forms on ${\mathbf R}^n$ and ${\mathbf R}^{n+1,1}$.
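As a numerical sanity check of this example (my own sketch in numpy, taking $n = 3$ and the diagonal form $\mathrm{diag}(1,\dots,1,-1)$ on ${\mathbf R}^{n+1,1}$):

```python
import numpy as np

n = 3
G2 = np.diag([1.0] * (n + 1) + [-1.0])   # Gram matrix of signature (n+1, 1)

def sigma(x):
    """sigma(x_1, ..., x_n) = (x_1, ..., x_n, x_n^3, x_n^3)."""
    return np.concatenate([x, [x[-1] ** 3, x[-1] ** 3]])

rng = np.random.default_rng(2)
v, w = rng.standard_normal(n), rng.standard_normal(n)

# The bilinear forms are preserved: B_1(v, w) = B_2(sigma(v), sigma(w))...
print(np.isclose(v @ w, sigma(v) @ G2 @ sigma(w)))        # True

# ...yet sigma is not linear (it is not even additive).
print(np.allclose(sigma(v + w), sigma(v) + sigma(w)))     # False
```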

Even if there were no theorem like Vogt's, nothing would change in practice, since we want isometries of bilinear spaces to be linear maps anyway. For instance, if $(V,Q)$ is a non-degenerate quadratic space, you don't define an isometry of $(V,Q)$ to be any function $\sigma \colon V \rightarrow V$ satisfying $Q(\sigma(v)) = Q(v)$ for all $v \in V$, because such a function can be completely wild: think about ${\mathbf R}^3$ with its usual sum-of-squares quadratic form and let $\sigma$ be any function from ${\mathbf R}^3$ to itself that maps each sphere centered at the origin to itself, however wildly. An isometry of quadratic spaces is defined to be a linear map preserving the quadratic forms.
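To see one concrete (and still fairly tame) instance of such a sphere-preserving, non-linear map, here is a sketch in numpy (my own illustration): rotate each sphere about the $z$-axis by an angle that depends on its radius.

```python
import numpy as np

def sigma(v):
    """Rotate v about the z-axis by an angle equal to |v|.

    Each sphere |v| = r is mapped to itself (by a rotation that depends on r),
    so the quadratic form Q(v) = |v|^2 is preserved, but sigma is not linear.
    """
    r = np.linalg.norm(v)
    c, s = np.cos(r), np.sin(r)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])
    return R @ v

rng = np.random.default_rng(3)
v, w = rng.standard_normal(3), rng.standard_normal(3)

print(np.isclose(v @ v, sigma(v) @ sigma(v)))           # True: Q is preserved
print(np.allclose(sigma(v + w), sigma(v) + sigma(w)))   # False: not linear
```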