measurability of supremum of a class of functions
Lack of measurability of $g$ is a well-known issue in stochastic optimal control, so this question has been studied quite extensively. The theory is richest when $X$ and $Y$ are not general metric spaces but are homeomorphic to Borel subsets of complete separable metric spaces. One usually says that such $X,Y$ are (standard) Borel spaces.
Note that $g$ is Borel measurable on $X$ iff $\{x:g(x)>c\}$ is a Borel set for every real $c$, but $$ \{x:g(x)>c\} = \pi_X\{(x,y):f(x,y)>c\} \tag{1} $$ where $\pi_X$ is the projection map onto $X$. Although Lebesgue believed otherwise, projections of Borel sets may fail to be Borel sets (as discovered by Souslin and Luzin). As a result, although Nate's example uses a $Y$ that is not a Borel space, we can construct a similar example with a Borel $Y$. Take $B$ to be any Borel subset of $X\times Y$ such that $A:=\pi_X(B)$ is not Borel; then the measurable function $f = 1_B$ gives a non-measurable $g = 1_A$.
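To spell out why $(1)$ holds, here is a short derivation. It assumes, as in the question, that $g(x) = \sup_{y\in Y} f(x,y)$; the only fact used is that a supremum exceeds $c$ exactly when some member of the family does.

$$
g(x) > c \iff \exists\, y\in Y:\ f(x,y) > c \iff x \in \pi_X\{(x,y): f(x,y) > c\}.
$$

Equivalently, $\{x: g(x)>c\} = \bigcup_{y\in Y}\{x: f(x,y)>c\}$, which is in general an uncountable union, and that is where measurability can be lost. In the example with $f = 1_B$ the same computation gives

$$
g(x) = \sup_{y\in Y} 1_B(x,y) = \begin{cases} 1, & x \in \pi_X(B) = A,\\ 0, & \text{otherwise,}\end{cases}
$$

so $g = 1_A$.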
At the same time, the projection of a Borel subset of a Borel space is always an analytic set. In fact, analytic sets are often defined as images of Borel sets under Borel maps. Analytic subsets of Borel spaces are universally measurable: that is, for any Borel probability measure $p$ on $X$, any analytic set $A$ is $p$-measurable (measurable with respect to the completion of $p$), so we can define $p(A)$ unambiguously.
Following $(1)$, let's say that $f$ is upper semianalytic whenever $\{(x,y):f(x,y)>c\}$ is analytic for all $c\in \Bbb R$. Since projections of analytic sets are again analytic, $(1)$ shows that $g$ is also upper semianalytic, so this class of functions is closed under taking such suprema. Clearly, every Borel function is upper semianalytic, but not vice versa: e.g. $g = 1_A$ from the second paragraph.
If you are still interested only in Borel measurability, you need some continuity assumptions. For example, if $f$ is lower semicontinuous, then so is $g$; and if $f$ is upper semicontinuous and $Y$ is compact, then so is $g$. Semicontinuous functions are Borel measurable, of course. You may want to take a look at Section 7.5 of the book Stochastic Optimal Control by Bertsekas and Shreve, freely available at MIT.
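A sketch of why the two semicontinuity claims hold, assuming $g(x)=\sup_{y\in Y} f(x,y)$, that $Y$ is nonempty, and that semicontinuity is meant jointly in $(x,y)$. For lower semicontinuous $f$, each set $\{(x,y): f(x,y)>c\}$ is open and $\pi_X$ is an open map, so

$$
\{x: g(x) > c\} = \pi_X\{(x,y): f(x,y) > c\} \quad\text{is open for every } c\in\Bbb R,
$$

which says that $g$ is lower semicontinuous. For upper semicontinuous $f$ and compact $Y$, the supremum over $y$ is attained, so

$$
\{x: g(x) \ge c\} = \pi_X\{(x,y): f(x,y) \ge c\},
$$

and the right-hand side is the projection of a closed set along a compact factor, hence closed. Thus $g$ is upper semicontinuous.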
FWIW, this situation is closely related to the following one. A map $\phi:X\to Y$ between two Borel spaces is Borel measurable iff its graph is Borel measurable. However, for some Borel $B\subseteq X\times Y$ there may be no Borel $\phi$ defined on $\pi_X(B)$ whose graph is contained in $B$. There still does exist a universally measurable $\phi$ with this property, even if $B$ is just analytic, not necessarily Borel. For a Borel $\phi$ to exist, certain continuity assumptions are needed, such as compactness of $B$.
Nope. Take, say, $X = [0,1]$ and $Y$ your favorite non-measurable subset of $[0,1]$. Let $f(x,y) = 1$ if $x=y$ and $0$ otherwise. Then $g = 1_Y$ which is not a measurable function on $X$.
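To check the example explicitly, again assuming $g(x)=\sup_{y\in Y} f(x,y)$: for each $x\in[0,1]$,

$$
g(x) = \sup_{y\in Y} 1_{\{x=y\}} = \begin{cases} 1, & x\in Y,\\ 0, & x\notin Y,\end{cases}
\qquad\text{so}\qquad \{x: g(x) > \tfrac12\} = Y .
$$

On the other hand, $f$ is the indicator of $\{(x,y)\in X\times Y: x=y\}$, which is closed in $X\times Y$, so $f$ itself is Borel measurable on $X\times Y$.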