Why are degenerate states more likely to be filled at a given temperature?
It's worth considering the high temperature limit, where $k_{\textrm{B}}T \gg \epsilon$ (or $\beta \epsilon \ll 1$). In this limit, the energy difference between the states is negligible compared with the thermal energy, so all of the states have effectively the same Boltzmann weight and therefore the same probability of occurring. Thus, if $g$ is the degeneracy of the excited level, the probability of being in any one of the $g+1$ states is $1/(g+1)$. This means that the probability of the system having energy 0 approaches $1/(g+1)$, and the probability of the system having energy $\epsilon$ approaches $g/(g+1)$.
Thus, in making the transition from low temperature (where only the ground state is populated) to high temperature (where all states are populated equally), the probability of being in any one of the excited states must increase.
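To make the two limits concrete, here is a short worked version. It assumes the standard canonical (Boltzmann) distribution for a system with one ground state at energy 0 and $g$ excited states at energy $\epsilon$; the symbols $Z$, $P_0$, and $P_\epsilon$ are labels introduced here for illustration.

$$Z = 1 + g\,e^{-\beta\epsilon}, \qquad P_0 = \frac{1}{Z} = \frac{1}{1 + g\,e^{-\beta\epsilon}}, \qquad P_\epsilon = \frac{g\,e^{-\beta\epsilon}}{Z} = \frac{g}{e^{\beta\epsilon} + g}.$$

For $\beta\epsilon \gg 1$ (low temperature), $e^{-\beta\epsilon} \to 0$, so $P_0 \to 1$ and $P_\epsilon \to 0$. For $\beta\epsilon \ll 1$ (high temperature), $e^{-\beta\epsilon} \to 1$, so $P_0 \to 1/(g+1)$ and $P_\epsilon \to g/(g+1)$. A single excited state has probability $e^{-\beta\epsilon}/Z = 1/(e^{\beta\epsilon} + g)$, which rises monotonically from 0 toward $1/(g+1)$ as the temperature increases.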
Just to make sure this is clear, let me briefly explain why all of the states must be equally likely in the high temperature limit. The (or a) physical context of statistical mechanics is one where we have a system in thermal and/or mechanical contact with a thermal reservoir (a much larger system) of temperature $T$. Roughly speaking, the average energy of a particle in the reservoir is $k_{\textrm{B}}T$.
We can imagine a collision between a particle in the reservoir and a particle in the system:
- If $k_{\textrm{B}}T \ll \epsilon$, then the reservoir particle doesn't have enough energy to bump the system particle up into an excited state, and so at low temperatures, the system stays mostly in the ground state on average.
- If $k_{\textrm{B}}T \gg \epsilon$, however, then the reservoir particle has much more energy than the spacing between the system's levels, and so if it collides with a system particle, it can either change the state or not: the energetics basically allow either case, and so the net result is that a system particle is equally likely to be in any one of its states.
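
The two regimes in the list above can also be checked numerically. The following is a minimal sketch, assuming Boltzmann statistics for one ground state at energy 0 and $g$ excited states at energy $\epsilon$; the function name `level_probabilities` and the choice $g = 3$ are hypothetical and used only for illustration.

```python
import math


def level_probabilities(g, beta_eps):
    """Boltzmann probabilities for a ground state (energy 0) and a
    g-fold degenerate excited level (energy eps).

    beta_eps is the dimensionless ratio eps / (k_B * T).
    Returns (P(E=0), P(E=eps), probability of one excited state).
    """
    w = math.exp(-beta_eps)  # Boltzmann factor of a single excited state
    z = 1.0 + g * w          # partition function: ground state + g excited states
    return 1.0 / z, g * w / z, w / z


if __name__ == "__main__":
    g = 3  # illustrative degeneracy (an assumption, not from the question)
    # Large beta*eps is low temperature; small beta*eps is high temperature.
    for beta_eps in (10.0, 1.0, 0.001):
        p0, p_level, p_state = level_probabilities(g, beta_eps)
        print(
            f"beta*eps = {beta_eps:g}: P(E=0) = {p0:.4f}, "
            f"P(E=eps) = {p_level:.4f}, per excited state = {p_state:.4f}"
        )
```

As `beta_eps` decreases, the per-state probability of each excited state should climb toward the ground-state probability, and the excited level as a whole should end up favored by a factor of roughly $g$.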