Complex integration by shifting the contour
A simple reference problem
Suppose we want to analyse the problem of a forced harmonic oscillator. Denote as $\phi(t)$ the time dependent position of the oscillator. The oscillator experiences two forces, the spring force $-k\phi(t)$ and an external force $F_{\text{ext}}(t)$. Newton's law says
$$ \begin{align} F(t) &= m a(t) \\ -k \phi(t) + F_{\text{ext}}(t) &= m \ddot{\phi}(t) \\ F_{\text{ext}}(t)/m &= \ddot{\phi}(t) + (k/m) \phi(t) \\ j(t) &= \ddot{\phi}(t) + \omega_0^2 \phi(t) \tag 1 \end{align} $$
where $\omega_0$ is the free oscillation frequency and $j(t)\equiv F_{\text{ext}}/m$. We use the following Fourier transform convention: $$ \begin{align} f(t) &= \int_\omega \tilde{f}(\omega) e^{i\omega t} \frac{\mathrm d\omega}{2\pi} \\ \tilde{f}(\omega) &= \int_t f(t) e^{-i\omega t}~\mathrm dt . \end{align} $$
With this convention on Eq. $(1)$, and defining $$\omega_{\pm} \equiv \pm \omega_0,$$ we find $$ \tilde{\phi}(\omega) = \frac{\tilde{j}(\omega)}{\omega_0^2-\omega^2} = \frac{-\tilde{j}(\omega)}{(\omega-\omega_+)(\omega-\omega_-)}. \tag 2 $$ From Eq. $(2)$ we see that the Green's function is $$\tilde{G}(\omega) = \frac{-1}{(\omega-\omega_+)(\omega-\omega_-)}$$ which has poles on the real axis. If we want to compute $\phi(t)$ we do a Fourier transform
$$ \phi(t) = \int_\omega \frac{-\tilde{j}(\omega)e^{i\omega t}}{(\omega-\omega_+)(\omega-\omega_-)} \frac{\mathrm d\omega}{2\pi} = \int_\omega e^{i\omega t}\tilde{j}(\omega) \tilde{G}(\omega)\frac{\mathrm d\omega}{2 \pi}. \tag{*} $$
This integral is tricky because of the poles on the axis. The solution everyone knows is to push the poles off the axis by adding an imaginary part to $\omega_{\pm}$, or by moving the contour above or below the real axis, but what does this actually mean physically? How do we choose which direction to push the poles or move the contour?
Damping to the rescue
In a real system, we always have some damping. In our oscillator model, this could come in the form of a velocity dependent friction $F_{\text{friction}} = -\mu \dot{\phi}(t)$. Defining $2\beta = \mu/m$, the equation of motion becomes
$$\ddot{\phi}(t) + 2\beta \dot{\phi}(t) + \omega_0^2\phi(t) = j(t) . \tag 3$$
Fourier transforming everything again leads to Eq. $(2)$ but now with \begin{equation} \omega_{\pm} = \pm \omega_0' + i\beta \end{equation} where \begin{equation} \omega_0' = \omega_0\sqrt{1-(\beta/\omega_0)^2}. \end{equation}
Therefore, we see that adding damping moves the poles a bit toward the origin along the real axis, but also gives them a positive imaginary component. In the limit of small damping (i.e. $\beta \ll \omega_0$), we find $\omega_0' \approx \omega_0$. In other words, the frequency shift of the poles due to the damping is small. So let's ignore that and focus on the added imaginary part.
Ok, suppose we want to do the integral $(*)$ in the case that $j(t)$ is a delta function at $t=0$. In that case, $\tilde{j}=1$ (I'm ignoring units) and we have $$ \phi(t) = \int_\omega \frac{e^{i\omega t}}{(\omega-\omega_+)(\omega-\omega_-)} \frac{\mathrm d\omega}{2\pi} $$ As you noted, for $t<0$ you have to close the contour in the lower plane in order to use Jordan's lemma. There aren't any poles in the lower half plane, so we get $\phi(t<0)=0$. This makes complete sense: the driving force is a delta function at $t=0$ and there shouldn't be any response of the system before the driving happens. This means that our introduction of friction imposed a causal boundary condition to the system! For $t>0$, you close in the upper half plane where there are poles, and so you get some response out of the integral.
Damping as a tool
In many cases, you don't naturally have damping in the system. For example, the Green's function from the question, $$\int^{\infty}_{-\infty}~\mathrm dk_0 \frac{e^{−ik_0 z_0}}{k_0^2 − \kappa^2}$$ doesn't have any damping and thus the poles sit on the real axis. So what you do is just bump the contour a bit up or down, or equivalently add $\pm i \beta$ to the poles (most people write $i \epsilon$ instead of $i \beta$), then do the integral, and finally take $\beta \rightarrow 0$. In doing this, you're solving the problem in the presence of damping (or anti-damping), and then taking the damping to zero in the end to recover the no-damping case.
Choosing to push the contour up or down, or equivalently choosing the sign of $\pm i \beta$, corresponds to imposing either friction or anti-friction, causal or anti-causal boundary conditions. If you pick the "causal" boundary condition, you find that the response of the system to a delta function in time and space is an outgoing spherical wave which starts at the delta function source. This gives you the so-called "retarded Green's function". If you pick the other condition, you find that the solution for a point source is actually an incoming spherical wave which converges right onto the point of the source. This gives you the so-called "Advanced Green's function".
The thing is, you can solve a problem using either Green's function. You're "allowed" to push the contour up or down (or add $+i\beta$ or $-i\beta$ to the poles) because you invented that as a trick to do the integral; it's not representing a real factor in your physical system. Of course, in problems where there is damping, the choice is made for you. When you have damping, you can't have fields at infinity; they'd be damped away by the time they interact with your sources.
I hope this was helpful, and I really hope if someone finds mistakes they'll jump in and fix 'em.
I think this question has been basically answered. There is a lot of mathematical stuff above (contours, etc.) but the initial question posted was a physics question.
Briefly: you have a differential equation describing some physical problem. You find the solution. In this case it is an integral that is "funny" and requires some choices to be made to make it unique and meaningful. The physics questions is why the choices are made and what do they mean?
Remember, a differential equation by itself is not a full description of physical reality. Solving the differential equation gives you all potential solutions to a problem which is nice but sort of too much. Differential equations always come hand in hand with boundary conditions. The boundary conditions are equally important. They tell you which solution is the meaningful one for your problem.
In this case, as explained above, the boundary condition is retarded versus advanced solution. When you bang on the system at t=0 (with delta function forcing), does the system response go forwards in time or backwards? I.e., was it doing nothing prior to t=0 and then does stuff (retarded)? Or was it doing stuff for t<0 and then you banged it just right to make it stop for t>0 and do nothing (advanced)? The final prescription (Feynman one of moving one pole up and one pole down) is called "time-ordered": you make a new kind of boundary condition that glues the retarded and advanced solutions together in a particular way at t=0.
As explained above, there is no "right" answer or "right" way of doing the integral. It depends on the boundary conditions. Without boundary conditions, all ways are OK.
If you think this boundary condition stuff is just for this case and nothing very common, please reconsider. For example, we all know intuitively that when you talk about eigenstates of some Hamiltonian in physics, they are normalized. That gives us meaning and allows us to compute probabilities. Also that for a finite system the energies are discretized. But mathematically it is totally not like this: if you just solve the Schrodinger equation as a mathematical entity, you will get solutions at all energies and associated eigenfunctions. Only those special ones with just the right energies are normalizable. Here the boundary condition is the wave function is normalizable (goes to zero fast enough at infinity). Without it, you can't calculate anything or make any real predictions of probability sine you can't figure out what is what or normalize things. So boundary conditions are pretty important and basic!
As written, the integral simply doesn't exist in the Riemann sense: there's a singularity at $k_0=\pm\kappa$. So really we're looking at the Cauchy Principle part, which is defined by:
$$\int_{-\infty}^\infty dk_0 \frac{e^{-ik_0z_0}}{k_0^2-\kappa^2}=\lim_{\epsilon\rightarrow 0}\left(\int_{-\infty}^{-\kappa-\epsilon}+\int_{-\kappa+\epsilon}^{\kappa-\epsilon}+ \int_{\kappa+\epsilon}^{\infty}\right)dk_0\frac{e^{-ik_0z_0}}{k_0^2-\kappa^2}.$$
For fixed $\epsilon$, each integrand is nice and continuous (near $\epsilon$ of the real axis), so if we integrate on a contour that's slightly above the real axis, we'll get something very close to the original integral, in terms of $\epsilon$. If you now control the error in terms of epsilon, you can make the whole argument rigorous and subsequently take $\epsilon\rightarrow 0$.