What is the fastest way to compute sin and cos together?
Modern Intel/AMD processors have instruction FSINCOS
for calculating sine and cosine functions simultaneously. If you need strong optimization, perhaps you should use it.
Here is a small example: http://home.broadpark.no/~alein/fsincos.html
Here is another example (for MSVC): http://www.codeguru.com/forum/showthread.php?t=328669
Here is yet another example (with gcc): http://www.allegro.cc/forums/thread/588470
Hope one of them helps. (I didn't use this instruction myself, sorry.)
As they are supported on processor level, I expect them to be way much faster than table lookups.
Edit:
Wikipedia suggests that FSINCOS
was added at 387 processors, so you can hardly find a processor which doesn't support it.
Edit:
Intel's documentation states that FSINCOS
is just about 5 times slower than FDIV
(i.e., floating point division).
Edit:
Please note that not all modern compilers optimize calculation of sine and cosine into a call to FSINCOS
. In particular, my VS 2008 didn't do it that way.
Edit:
The first example link is dead, but there is still a version at the Wayback Machine.
Technically, you’d achieve this by using complex numbers and Euler’s Formula. Thus, something like (C++)
complex<double> res = exp(complex<double>(0, x));
// or equivalent
complex<double> res = polar<double>(1, x);
double sin_x = res.imag();
double cos_x = res.real();
should give you sine and cosine in one step. How this is done internally is a question of the compiler and library being used. It could (and might) well take longer to do it this way (just because Euler’s Formula is mostly used to compute the complex exp
using sin
and cos
– and not the other way round) but there might be some theoretical optimisation possible.
Edit
The headers in <complex>
for GNU C++ 4.2 are using explicit calculations of sin
and cos
inside polar
, so it doesn’t look too good for optimisations there unless the compiler does some magic (see the -ffast-math
and -mfpmath
switches as written in Chi’s answer).
Modern x86 processors have a fsincos instruction which will do exactly what you're asking - calculate sin and cos at the same time. A good optimizing compiler should detect code which calculates sin and cos for the same value and use the fsincos command to execute this.
It took some twiddling of compiler flags for this to work, but:
$ gcc --version
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5488)
Copyright (C) 2005 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ cat main.c
#include <math.h>
struct Sin_cos {double sin; double cos;};
struct Sin_cos fsincos(double val) {
struct Sin_cos r;
r.sin = sin(val);
r.cos = cos(val);
return r;
}
$ gcc -c -S -O3 -ffast-math -mfpmath=387 main.c -o main.s
$ cat main.s
.text
.align 4,0x90
.globl _fsincos
_fsincos:
pushl %ebp
movl %esp, %ebp
fldl 12(%ebp)
fsincos
movl 8(%ebp), %eax
fstpl 8(%eax)
fstpl (%eax)
leave
ret $4
.subsections_via_symbols
Tada, it uses the fsincos instruction!