What is the correct relativistic distribution function?
I think you make it sound much more mysterious than it is. The relativistic distribution function is $$ f_p = \frac{1}{(2\pi)^3}\exp(-(\mu-u\cdot p)/T)\, $$ where $u_\alpha$ is the 4-velocity of the fluid, $p_\alpha$ is the 4-momentum of the particle, $T$ is temperature, and $\mu$ is the chemical potential. This is sometimes called the Juttner distribution, and it has obvious generalizations to Bose-Einstein and Fermi-Dirac statistics.
This expression is obviously Lorentz-invariant ($u\cdot p$ is a scalar, and so are $\mu$ and $T$), it reduces to Boltzmann in the rest frame of the fluid, and it is Gallilean invariant for slowly moving fluids.
The funny Bessel functions appear if you try to determine the fugacity $e^{\mu/T}$ in terms of the density, $n=N_0$ with $$ N_\mu = \int \frac{d^3p}{p^0} p_\mu f_p\, , $$ because now your normalization factor contains integrals over $\exp(-\sqrt{p^2+m^2}/T)$
Additional Remarks: After some prodding by the OP I looked at the Treumann et al. paper (it is available on the arxive, http://arxiv.org/abs/1105.2120). I initially thought that this was just a needlessly complicated rederivation of well-known results, but this is not the case. The paper is just wrong. (Frankly, it is not a good sign if a paper that points out a major flaw in relativistic kinetic theory is published in the physics section of the arxive, and a geophysics journal.)
The paper starts out o.k., observing that since $d^3xd^3p$ is Lorentz invariant, $f_p$ must be a scalar. However, they write down a distribution function which is clearly not a scalar unless I assume that $T$ is the zeroth component of a vector. Even if that could be arranged, the result is nor right because it is not Galilean invariant for small velocities. He then writes down a non-covariant expression for $N_\mu$. If $f_p$ is a scalar, $d^3p p_\mu f_p$ is not a vector, and as a result his particle density $N_0$ does not transform as a density (I wrote the correct expression above). Since $N_0$ is wrong, the normalization $\Lambda$ is also wrong.
More Remarks: After more prodding, an attempt to answer the original questions:
1) The original Boltzmann distribution is anisotropic if the fluid velocity $\vec{u}$ is not zero (the distribution is isotropic in the fluid rest frame), but presumably you are looking for something more general. The distribution function of a viscous fluid is anisotropic already in the rest frame of the fluid, $f_p=f_p^0 +\delta f_p$ with $$\delta f_p \sim \eta\sigma_{\alpha\beta}p^\alpha p^\beta,$$ where $\sigma_{\alpha\beta}$ is the relativistic strain tensor. Even more generally, we can parameterize the distribution function in terms of the currents and the stresses. This is the relativistic version of Grad's 13 moment method, see, for example, http://arxiv.org/abs/1301.2912. People have also written down models for highly anisotropic distribution functions, see for example http://arxiv.org/abs/1007.0130 .
2) The temperature is a scalar. It is related to the kinetic energy per particle in the rest frame. The kinetic energy density is the 00 component of a tensor, and transforms accordingly.
E. Lehmann (J. of Math. Physics 47 023303, 2006) attempted this in his paper Covariant equilibrium statistical mechanics (the paper is also in arXiv). He found that his fully covariant distribution became flatter than the non-covariant one. Also that while the non-covariant partition function continued to increase with temperature the fully covariant one reached a maximum at about logT~12.
Even though there is loss of particle count invariance in the fully covariant case one can still define an energy density and with that a temperature in the classical sense assuming thermodynamic equilibrium. However as the temperature increases we will reach conditions that prevailed in the universe at the time of the big bang and I doubt if anyone will claim that can be an equilibrium situation. Physical continuity would suggest that the effects would come gradually. The temperature at Lehmann's maximum occurs in core collapse supernovae.
It is possible to reproduce the principal results of Lehmann's analysis (maximum temperature and flattening distribution) using the simple argument Maxwell first used in the classical case when he was still a teenager. That is we split the 3D distribution function into a product of three identical functions and use the composition of velocities to get the form of the distribution. The classical composition is easily replaced by the relativistic using the Lorentz transformation. The resulting distribution becomes flat at logT~11 after which it becomes U-shaped. There are actually two underlying assumptions in Maxwell's approach: molecular chaos and that an equilibrium distribution must be isotropic. Isotropy might be a necessary condition but this result suggests that it is not sufficient.