Photons in expanding space: how is energy conserved?

Since you say you're talking about what happens locally (in a small volume), I'll answer from that point of view. The usual formulation of energy conservation in such a volume is that energy is conserved in an inertial reference frame. In general relativity, there are no truly inertial frames, but in a sufficiently small volume, there are reference frames that are approximately inertial to any desired level of precision. If you restrict your attention to such a frame, there is no cosmological redshift. Measured in that frame, the photon's energy when it enters one side of the volume is the same as its energy when it exits the other side. So there's no problem with energy conservation.

The (apparent) failure of energy conservation arises only when you consider volumes that are too large to be encompassed by a single inertial reference frame.

To be slightly more precise, in some small volume $V=L^3$ of a generic expanding Universe, imagine constructing the best possible approximation to an inertial reference frame. In that frame, comoving observers near one edge will be moving with respect to comoving observers near the other edge, at a speed given by Hubble's Law (to leading order in $L$). That is, in such a frame, the redshift those observers measure is an ordinary Doppler shift, which causes no problems with energy conservation.
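
As a rough check of the "leading order in $L$" claim, here is a sketch of the comparison. The symbols are my own additions for illustration: $H$ is the Hubble parameter, assumed roughly constant over the light-crossing time $L/c$, $a(t)$ is the scale factor, and I assume $HL \ll c$. The relative speed of the two observers and the resulting Doppler shift are

$$
v \approx H L, \qquad 1 + z_{\rm Doppler} = \sqrt{\frac{1 + v/c}{1 - v/c}} \approx 1 + \frac{H L}{c} ,
$$

while the usual cosmological formula, with emission and observation separated by $\Delta t \approx L/c$, gives

$$
1 + z_{\rm cosmo} = \frac{a(t_{\rm obs})}{a(t_{\rm emit})} \approx 1 + H\,\Delta t \approx 1 + \frac{H L}{c} .
$$

The two agree at first order in $HL/c$; the differences show up only at order $(HL/c)^2$, which is where the "small volume" approximation starts to fail. For a sense of scale, taking an assumed round value $H \approx 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}$ and $L = 1\ \mathrm{Mpc}$ gives $v \approx 70\ \mathrm{km/s}$ and $z \approx 2.3 \times 10^{-4}$.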

If you want more detail, David Hogg and I wrote about this at considerable (perhaps even excessive!) length in a paper in the American Journal of Physics (AJP).