How to derive the posterior predictive distribution?

To show this, one can follow a fairly standard argument. In what follows, for notational convenience, I have replaced your "$D$"s with "$S$"s. By the law of total expectation (the tower property of conditional expectation) and Fubini's theorem, applied to any bounded measurable function $f$ defined on the relevant sample space $\Omega$, we have
$$
\begin{aligned}
\int_{\Omega}f(s')\,p(s'\mid s)\,\mathrm ds'
&=\mathbb E[f(S')\mid S=s]
=\mathbb E\bigl[\mathbb E[f(S')\mid \Theta, S]\mid S=s\bigr]\\
&=\int\left(\int_{\Omega}f(s')\,p(s'\mid \theta, s)\,\mathrm ds'\right)p(\theta\mid s)\,\mathrm d\theta\\
&=\int_{\Omega}f(s')\left(\int p(s'\mid \theta, s)\,p(\theta\mid s)\,\mathrm d\theta\right)\mathrm ds'.
\end{aligned}
$$

Since the far left-hand side equals the far right-hand side for every bounded measurable $f$, the two densities must agree almost everywhere, and we conclude that
$$
p(s'\mid s)=\int p(s'\mid \theta, s)\,p(\theta\mid s)\,\mathrm d\theta .
$$
Under the usual conditional-independence (Markov-type) assumption $p(s'\mid\theta,s)=p(s'\mid\theta)$, this reduces to the familiar form $p(s'\mid s)=\int p(s'\mid\theta)\,p(\theta\mid s)\,\mathrm d\theta$.
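To make the formula concrete, here is a standard worked example (the Beta–Bernoulli model is my choice of illustration, not part of the question). Take $X_1,\dots,X_n\mid\theta \stackrel{\text{iid}}{\sim} \mathrm{Bernoulli}(\theta)$ with prior $\theta\sim\mathrm{Beta}(\alpha,\beta)$, and let $s$ be an observed sample with $k$ successes. Then $\theta\mid s\sim\mathrm{Beta}(\alpha+k,\beta+n-k)$, and the posterior predictive probability of a new success is
$$
p(s'=1\mid s)=\int_0^1 p(s'=1\mid\theta)\,p(\theta\mid s)\,\mathrm d\theta=\int_0^1 \theta\,p(\theta\mid s)\,\mathrm d\theta=\mathbb E[\Theta\mid s]=\frac{\alpha+k}{\alpha+\beta+n}.
$$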


Alternatively, here is a shorter argument that works directly with densities. The factorization $p(D',\theta \mid D) = p(D' \mid \theta, D)\,p(\theta \mid D)$ follows from the chain rule for conditional densities (essentially the definition of conditional probability), provided the densities exist:

$p(D',\theta \mid D) = \frac{p(D', \theta, D)}{p(D)} = \frac{p(D'\mid\theta, D)\, p(\theta, D)}{p(D)} = p(D'\mid\theta, D)\, p(\theta \mid D)$.

Now integrate out the nuisance variable $\theta$ on both sides to obtain $p(D'\mid D)=\int p(D'\mid\theta, D)\,p(\theta\mid D)\,\mathrm d\theta$. Your formula also carries a Markov-type assumption, $p(D'\mid\theta, D)=p(D'\mid\theta)$: given $\theta$, the new data are conditionally independent of the old.
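
As a numerical sanity check on this identity, here is a minimal Monte Carlo sketch in Python (my own illustration, assuming the Beta–Bernoulli model from the worked example above). It draws $\theta$ from the posterior, averages $p(D'=1\mid\theta)=\theta$ over the draws, and compares the result to the closed-form posterior predictive $(\alpha+k)/(\alpha+\beta+n)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative model (an assumption, not from the question):
# theta ~ Beta(alpha, beta), observations are iid Bernoulli(theta).
alpha, beta = 2.0, 3.0
data = np.array([1, 0, 1, 1, 0, 1])  # observed D
k, n = data.sum(), data.size

# Closed-form posterior predictive: p(D' = 1 | D) = (alpha + k) / (alpha + beta + n).
exact = (alpha + k) / (alpha + beta + n)

# Monte Carlo version of the integral: draw theta ~ p(theta | D),
# which here is Beta(alpha + k, beta + n - k), then average
# p(D' = 1 | theta) = theta over the draws.
thetas = rng.beta(alpha + k, beta + n - k, size=1_000_000)
approx = thetas.mean()

print(f"closed form:  p(D'=1 | D) = {exact:.4f}")
print(f"Monte Carlo:  p(D'=1 | D) = {approx:.4f}")
```

The two printed values should agree to a few decimal places, which is exactly the statement $p(D'\mid D)=\int p(D'\mid\theta)\,p(\theta\mid D)\,\mathrm d\theta$ in this conjugate model.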