Avoiding numerical cancellation when computing $\sin x -\sin y$ for $x \approx y$
It really depends on how exactly $x$ and $y$ are given. Frequently what we really have is not $x$ and $y$ but $x$ and $y-x$. We can then gain precision if we can rewrite the subtraction of nearly equal numbers analytically, exclusively in terms of $y-x$, like you did here, because we are already given $y-x$ accurately (more accurately than we would get by computing it directly).
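For instance, here is a minimal sketch (the helper name `sin_increment` and its signature are my own invention) of how this plays out when we are handed $x$ and the increment $h = y - x$ directly:

```python
import math

def sin_increment(x, h):
    """Return sin(x + h) - sin(x) without cancellation.

    Uses the identity sin(x + h) - sin(x) = 2 cos(x + h/2) sin(h/2),
    so the accurately known increment h enters only through
    well-conditioned operations; nothing nearly equal is subtracted.
    """
    return 2.0 * math.cos(x + 0.5 * h) * math.sin(0.5 * h)
```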
There are actually standard library functions specialized to exactly this purpose, for example `log1p`, which computes $\log(1+x)$ accurately for small $x$.
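In Python, for example:

```python
import math

x = 1e-12
print(math.log(1.0 + x))  # ~1.00009e-12: only ~4 digits correct, because
                          # forming 1 + x rounds away most of x
print(math.log1p(x))      # ~9.999999999995e-13: correct to full precision
```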
One possibility: determine the factor $\sin\left(\frac{x-y}{2}\right)$ using its Maclaurin series; the faster the series converges, the fewer ill-conditioned operations you need to perform. The small argument in that factor gets you there.
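A sketch of the idea in plain double precision (the name `sin_small` is mine):

```python
def sin_small(t):
    """Maclaurin series sin(t) = t - t**3/3! + t**5/5! - ...

    For |t| << 1 the terms shrink so fast that only a handful are
    needed, and adding a tiny term to a much larger partial sum is
    benign: no cancellation between nearly equal quantities occurs.
    """
    term, total, k = t, t, 1
    while True:
        term *= -t * t / ((2 * k) * (2 * k + 1))  # next series term
        if total + term == total:                 # term below roundoff
            return total
        total += term
        k += 1
```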
While a function $f : A \to B$ is a triple, consisting of a domain $A$, a codomain $B$, and a rule which assigns to each element $x \in A$ exactly one element $f(x) \in B$, too many focus exclusively on the rule and forget to carefully specify the domain and the codomain.
In this case the function in question is $f : \mathcal F \times \mathcal F \rightarrow \mathbb R$, where $$f(x,y) = \sin(x) - \sin(y),$$ and $\mathcal F$ is the set of machine numbers, say, double precision floating point numbers. I will explain below why $\mathcal F$ is the correct domain.
The problem of computing a difference $d = a - b$ between two real numbers $a$ and $b$ is ill-conditioned when $a \approx b$. Indeed, if $$\hat{a} = a(1+\delta_a)$$ and $$\hat{b} = b(1 + \delta_b)$$ are the best available approximations of $a$ and $b$, then we cannot hope to compute a better approximation of $d$ than $$\hat{d} = \hat{a} - \hat{b}.$$ Since $d - \hat{d} = b\delta_b - a\delta_a$, the relative error $$r = \frac{d - \hat{d}}{d}$$ satisfies the bound $$ |r| \leq \frac{|a| + |b|}{|a-b|} \max\{|\delta_a|,|\delta_b|\}.$$ When $a \approx b$, we cannot guarantee that the difference $d$ is computed with a small relative error, and in practice the relative error is frequently large. We say that the subtraction magnifies the errors committed when replacing $a$ with $\hat{a}$ and $b$ with $\hat{b}$.
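A toy illustration of the bound, with numbers chosen purely for convenience:

```python
# Take a = 1 + 1e-12 and b = 1, each carrying a relative error of 1e-16.
a, b = 1.000000000001, 1.0
delta = 1e-16                                 # size of the input errors
d = a - b                                     # about 1e-12
worst_error = abs(a) * delta + abs(b) * delta # about 2e-16
print(worst_error / d)  # about 2e-4: twelve of sixteen digits at risk
```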
In your situation $a = \sin(x)$ and $b = \sin(y)$. Errors are committed when computing the sine function. No matter how skilled we are, the best we can hope for is to obtain the floating point representation of $a$, i.e. $\text{fl}(a) = \sin(x)(1 + \delta)$, where $|\delta| \leq u$ and $u$ is the unit roundoff. Why? The computer may well have extra wide registers for internal use, but eventually the result has to be rounded to, say, double precision, so that it can be stored in memory. It follows that if we compute $f$ using the definition and $x \approx y$, then the computed result will have a relative error which is many times the unit roundoff.
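To see the damage, we can compare against a high-precision reference (here I assume the mpmath library is available; any extended-precision arithmetic would do):

```python
import math
from mpmath import mp  # used only as a high-precision reference

mp.dps = 50
x, y = 1.0, 1.0 + 1e-8

naive = math.sin(x) - math.sin(y)
reference = float(mp.sin(x) - mp.sin(y))
# naive and reference agree to only ~8 significant digits: roughly
# half the digits of double precision are lost to the cancellation.
print(naive, reference)
```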
In order to avoid the offending subtraction, we turn to the function $g : \mathcal F \times \mathcal F \to \mathbb R$ given by $$ g(x,y) = 2 \cos \left( \frac{x+y}{2} \right) \sin \left(\frac{x-y}{2} \right).$$ In the absence of rounding errors $f(x,y) = g(x,y)$, but in floating point arithmetic they behave quite differently. The subtraction of two nearby floating point numbers $x$ and $y$ is perfectly safe. In fact, if $y/2 \leq x \leq 2y$ and the subtraction is carried out with at least one guard digit, then $x-y$ is computed exactly (this is Sterbenz's lemma).
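A direct transcription, continuing the example above:

```python
import math

def g(x, y):
    # x - y is exact for x ~ y (Sterbenz), and sine is well-conditioned
    # near 0, so the second factor loses essentially no accuracy.
    return 2.0 * math.cos(0.5 * (x + y)) * math.sin(0.5 * (x - y))

# g(1.0, 1.0 + 1e-8) matches the high-precision reference above
# to within a few ulps.
```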
We are not entirely in the clear, as $x + y$ need not be a floating point number; it is computed with a relative error bounded by the unit roundoff. In the event that $$(x+y)/2 \approx \left(k + \frac{1}{2}\right) \pi$$ for some $k \in \mathbb Z$, the calculation of $g$ suffers from the fact that cosine is ill-conditioned near its roots.
Using a conditional to pick, for each pair of arguments, whichever expression is accurate there allows us to cover a larger subset of the domain, as in the sketch below.
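One possible arrangement (a sketch only: the threshold is a crude heuristic of mine, not a tuned constant):

```python
import math

def sin_diff(x, y):
    """Choose between the two mathematically identical expressions."""
    a, b = math.sin(x), math.sin(y)
    # If the sines are far apart, plain subtraction is well-conditioned
    # and avoids the rounding error committed in forming (x + y)/2.
    if abs(a - b) > 0.5 * max(abs(a), abs(b)):
        return a - b
    # Otherwise fall back to the product form: x - y is then computed
    # exactly and enters through a well-conditioned sine. (As noted
    # above, (x+y)/2 near a root of cosine remains a weak spot.)
    return 2.0 * math.cos(0.5 * (x + y)) * math.sin(0.5 * (x - y))
```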
In general, why $\mathcal F$ rather than $\mathbb R$? Consider the simpler problem of computing $f : \mathbb R \rightarrow \mathbb R$. In general, you do not know the exact value of $x$, and the best you can hope for is $\hat{x}$, the floating point representation of $x$. The impact of this error is controlled by the condition number of $f$. There is nothing you can do about large condition numbers, except switch to better hardware or simulate a smaller unit roundoff $u'$. This leaves you with the task of computing $f(\hat{x})$, where $\hat{x} \in \mathcal F$ is a machine number. That is why $\mathcal F$ is the natural domain during this second stage of designing an algorithm for computing approximations of $f : \mathbb R \to \mathbb R$.