When is the maximum likelihood estimator measurable?

It's an interesting question. Discussion of the measurability of estimators is usually (deliberately) avoided in statistical applications.

In the following I assume the i.i.d. case: $(X,\mathcal{F},P)$ is a probability space and $\Theta$ is a parameter space. The MLE is a special case of an M-estimator $\hat\theta_n=\hat\theta_n(X_1,\ldots,X_n)$, which is defined by

$$\tag{1}\label{1} P_n h(\cdot,\hat\theta_n)=\inf_{\theta\in\Theta}P_nh(\cdot,\theta) $$

for some measurable function $h:X\times\Theta\rightarrow \mathbb{R}$, where $P_n$ is the empirical measure, so that $P_nh(\cdot,\theta)=\frac{1}{n}\sum_{i=1}^n h(X_i,\theta)$ is a sample average. In the parametric likelihood case, $h(x,\theta)=-\ln f_\theta(x)$.
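To make the definition in \eqref{1} concrete, here is a minimal sketch for one particular model. Everything specific in it is my own assumption for illustration: the exponential density $f_\theta(x)=\theta e^{-\theta x}$, the compact parameter set $\Theta=[0.1,10]$, the finite grid used in place of an exact minimization, and the function names.

```python
import math
import random


def h(x, theta):
    """Negative log-density of Exponential(rate=theta): -ln(theta * exp(-theta * x))."""
    return theta * x - math.log(theta)


def empirical_criterion(sample, theta):
    """P_n h(., theta): the sample average of h(X_i, theta)."""
    return sum(h(x, theta) for x in sample) / len(sample)


def grid_m_estimator(sample, lo=0.1, hi=10.0, steps=2000):
    """Minimize P_n h(., theta) over a finite grid in the compact set [lo, hi].

    The grid is a crude stand-in for the exact infimum over Theta.
    """
    grid = [lo + (hi - lo) * k / steps for k in range(steps + 1)]
    return min(grid, key=lambda theta: empirical_criterion(sample, theta))


random.seed(0)
sample = [random.expovariate(2.0) for _ in range(500)]  # true rate assumed to be 2

theta_grid = grid_m_estimator(sample)
theta_closed = 1.0 / (sum(sample) / len(sample))  # closed-form MLE: 1 / sample mean

print(theta_grid, theta_closed)
```

In this model the criterion is convex in $\theta$, so the grid minimizer lies within one grid step of the closed-form MLE $1/\bar X_n$ whenever the latter falls inside $[0.1,10]$.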

The main issue with the measurability of $\hat\theta_n$ is the measurability of the infimum in \eqref{1}. For example, if $\Theta$ is a compact metric space and $h$ is continuous in $\theta$ for each $x\in X$, then there is enough structure to ensure that the relevant infimum is measurable: a compact metric space is separable, so by continuity the infimum over $\Theta$ equals the infimum over a countable dense subset of $\Theta$.
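The countable-subset argument can be spelled out as follows. This is a sketch under the assumption that $\Theta$ is a compact metric space; the symbols $D$ (a countable dense subset of $\Theta$) and $c$ (an arbitrary real number) are notation I introduce here. Since $\theta\mapsto P_nh(\cdot,\theta)$ is continuous for every fixed sample,

$$\inf_{\theta\in\Theta}P_nh(\cdot,\theta)=\inf_{\theta\in D}P_nh(\cdot,\theta),\qquad \Big\{\inf_{\theta\in D}P_nh(\cdot,\theta)<c\Big\}=\bigcup_{\theta\in D}\big\{P_nh(\cdot,\theta)<c\big\}.$$

For each fixed $\theta$ the set $\{P_nh(\cdot,\theta)<c\}$ is measurable because $h(\cdot,\theta)$ is, so the right-hand side is a countable union of measurable sets. Note that this only shows that the minimal value is measurable; obtaining a measurable minimizer $\hat\theta_n$ still requires a measurable selection argument.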

I found a more general discussion of the issue in "High Dimensional Probability, Vol. 1" on pages 34-58. First, let $\hat\theta_n^*$ denote an approximate M-estimator. A sequence of approximate estimators satisfies

$$P_nh(\cdot,\hat\theta_n^{*})-\inf_{\theta\in\Theta}P_nh(\cdot,\theta)\to 0$$

in (outer) probability (or a.s.). By imposing some structure on $(\Theta,\mathcal{S})$ (the measurable space associated with $\Theta$), it can be shown that, although a Borel-measurable approximate estimator need not exist, a universally measurable (u.m.) one does.

Here is Theorem A.2 on page 55 (which uses a version of the measurable selection theorem):

If $(X,\mathcal{F})$ is any measurable space and $(\Theta,\mathcal{S})$ is Suslin, and if a sequence of approximate M-estimators exists, then such estimators can be chosen to be u.m.

You can find even more discussion of measurability in Chapter 7 of "Stochastic Optimal Control: The Discrete-Time Case" (it's available online).