How to improve performance of BesselJ to the level of GSL?
Following the advice in the comments, I've made a test library that evaluates BesselJ[1, #] & via GSL. I still consider this a workaround, so if you find a way to get good performance out of the built-in Mathematica functions, please do post a new answer.
Needs["CCompilerDriver`"]
besselJ1src = "
#include \"WolframLibrary.h\"
#include <gsl/gsl_sf_bessel.h>

DLLEXPORT mint WolframLibrary_getVersion() {return WolframLibraryVersion;}
DLLEXPORT int WolframLibrary_initialize(WolframLibraryData libData) {return 0;}
DLLEXPORT void WolframLibrary_uninitialize(WolframLibraryData libData) {}

/* j1(x): thin wrapper around GSL's Bessel J1 */
DLLEXPORT int j1(WolframLibraryData libData, mint argc, MArgument* args, MArgument result)
{
    MArgument_setReal(result, gsl_sf_bessel_J1(MArgument_getReal(args[0])));
    return LIBRARY_NO_ERROR;
}
";
besselJ1lib = CreateLibrary[besselJ1src, "besselj1",
"Libraries" -> {"gsl", "gslcblas"}];
j1 = LibraryFunctionLoad[besselJ1lib, "j1", {Real}, Real];
Now I can execute my original code with j1 instead of BesselJ[1, #] &:
zs = N /@ Range[0, 12, 10^-5];
AbsoluteTiming[bessels = j1 /@ zs;]
{0.241519, Null}
And bessels do indeed contain numerical values.
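As an aside, the same trick of calling a fast C implementation of $J_1$ through a thin wrapper, instead of a slower built-in, can be tried outside Mathematica as well. For instance, POSIX libm ships a j1 function that Python's ctypes can load directly (a sketch; the library lookup and fallback name are platform-dependent assumptions, tested on glibc Linux):

```python
import ctypes
import ctypes.util

# Load the C math library; POSIX libm exposes j1(double) -> double,
# the same Bessel function the GSL wrapper above computes.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.j1.restype = ctypes.c_double
libm.j1.argtypes = [ctypes.c_double]

# libm.j1(1.0) is now a direct call into the native implementation.
```

The point in both cases is the same: the per-call overhead of a thin foreign-function wrapper is negligible next to the cost of an unvectorized high-level evaluation.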
I present in this answer a compiled implementation of one of the simpler algorithms for numerically evaluating a Bessel function of (modestly sized) integer order and (small to medium-sized) real argument. It uses Miller's algorithm:
bessj = With[{bjl = N[Log[1*^16]]}, (* log of the target inverse tolerance *)
   Compile[{{n, _Integer}, {x, _Real}},
    Module[{h, hb, k, l, m, r, s},
     If[x == 0., Return[If[n == 0, 1., 0.]]];
     (* bracket a suitable starting order by doubling... *)
     l = r = 1;
     While[r = 2 l;
      r (Log[r] - Log[Abs[x/2]] - 1) + Log[6 r + 1]/2 < bjl,
      l = r];
     (* ...then bisect for the smallest starting order meeting the tolerance *)
     While[r - l > 1, m = Quotient[l + r, 2];
      If[m (Log[m] - Log[Abs[x/2]] - 1) + Log[6 m + 1]/2 < bjl,
       l = m, r = m]];
     (* downward recurrence on the ratios h = J_k/J_(k-1),
        accumulating the normalization sum s over even orders *)
     h = 0.; s = 0.; hb = Internal`Bag[Most[{0.}]];
     Do[h = x/(2 k - h x); Internal`StuffBag[hb, h];
      s = h (s + 2 Mod[k + 1, 2]), {k, l, 1, -1}];
     (* 1 + s = 1/J_0; fix the sign for negative odd orders *)
     Internal`StuffBag[hb,
      If[n >= 0 || Mod[n, 2] == 0, 1, -1]/(1 + s)];
     (* J_n is the product of J_0 and the ratios up through order |n| *)
     Times @@ Take[Internal`BagPart[hb, All], -Abs[n] - 1]],
    RuntimeAttributes -> {Listable}]];
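For readers unfamiliar with Miller's algorithm, here is a rough, uncompiled sketch of the same idea in Python: a plain downward recurrence with normalization via $J_0 + 2\sum_{k\ge1} J_{2k} = 1$, rather than the ratio/bisection formulation used above. The function name and the simple starting-order heuristic are my own choices, not taken from the compiled code:

```python
import math

def bessel_j_miller(n, x):
    """Sketch of Miller's algorithm for J_n(x); illustrative only."""
    if x == 0.0:
        return 1.0 if n == 0 else 0.0
    nn = abs(n)
    # Start well above both the order and the argument (crude heuristic).
    m = max(nn, int(abs(x))) + 25
    if m % 2:
        m += 1                      # even start simplifies the even-order sum
    j_hi = 0.0                      # seed J_{m+1} = 0
    j_cur = 1.0e-30                 # arbitrary tiny seed for J_m
    s = 0.0                         # will hold J_0 + 2*sum of even-order J_k
    jn = 0.0
    for k in range(m, 0, -1):
        j_lo = 2.0 * k / x * j_cur - j_hi   # J_{k-1} = (2k/x) J_k - J_{k+1}
        if k == nn:
            jn = j_cur              # capture unnormalized J_|n|
        if k % 2 == 0:
            s += 2.0 * j_cur        # even-order contribution to the norm
        j_hi, j_cur = j_cur, j_lo
    if nn == 0:
        jn = j_cur                  # j_cur now holds unnormalized J_0
    s += j_cur                      # add J_0 itself to the normalization
    jn /= s                         # rescale: the true sum equals 1
    if n < 0 and nn % 2:
        jn = -jn                    # J_{-n}(x) = (-1)^n J_n(x)
    return jn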
Compare:
zs = N[Range[0, 12, 1*^-5]];
AbsoluteTiming[BesselJ[1, #] & /@ zs;]
{76.729389, Null}
AbsoluteTiming[bessj[1, #] & /@ zs;]
{39.575264, Null}
However, the listability of both the built-in function and the compiled version is not being exploited here, so here's another test:
AbsoluteTiming[j1 = BesselJ[1, zs];]
{74.120239, Null}
AbsoluteTiming[j1t = bessj[1, zs];]
{24.816419, Null}
Max[Abs[j1t - j1]]
2.94459*10^-13
Note that I did not compile to C in this case; doing so ought to yield a function with even better performance.
If need be, you can use the trapezoidal rule to evaluate a Bessel function of low integer order and small to moderately sized argument. That should be easy to do within a compiled function, if speed is of the essence.
I mentioned this in a comment months ago. Here is the long-overdue follow-through. What follows is a simple routine for $J_1(x)$ for real, moderately sized $x$, using the midpoint rule (since, as it turns out, the expressions look simpler with it; the performance is similar, however):
bj1 = With[{bjl = N[Log[1*^16]], p2 = N[π/2]},
   Compile[{{x, _Real}},
    Module[{m = 1, k, r, s},
     (* increase the node count until the error estimate meets the tolerance *)
     If[x != 0., While[(8 m - 2) (Log[4 m - 1] -
          Log[Abs[0.5 x]] - 1) + Log[25 m - 5] < 2 bjl, m++]];
     (* midpoint rule for (2/π) Integrate[Sin[x Sin[t]] Sin[t], {t, 0, π/2}] *)
     s = 0.;
     Do[r = Sin[(k + 0.5) p2/(m + 1)];
      s += r Sin[r x],
      {k, 0, m}];
     s/(m + 1)],
    RuntimeAttributes -> {Listable}]];
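The underlying quadrature can be seen more plainly in an uncompiled sketch (Python here for illustration; the fixed node count m and the function name are my own simplifications, in place of the adaptive node choice made in the compiled code):

```python
import math

def j1_midpoint(x, m=50):
    # Midpoint rule for J_1(x) = (2/pi) * Integral of sin(x sin t) sin t
    # over t in [0, pi/2], using m + 1 equally spaced midpoints.
    s = 0.0
    for k in range(m + 1):
        t = (k + 0.5) * (math.pi / 2) / (m + 1)  # midpoint of subinterval k
        r = math.sin(t)
        s += r * math.sin(r * x)
    # The 2/pi prefactor cancels against the subinterval width (pi/2)/(m + 1).
    return s / (m + 1)
```

Because the integrand extends to a smooth periodic function, the midpoint sum converges far faster than the usual $O(h^2)$ rate, which is why so few nodes suffice.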
Admittedly, it does not have the generality of my previous answer, but it is concise and quite efficient:
AbsoluteTiming[j1 = BesselJ[1, zs];]
{191.703, Null}
AbsoluteTiming[j1t = bj1[zs];]
{37.6003, Null}
Max[Abs[j1t - j1]]
4.44089*10^-16
For an explanation of why the trapezoidal/midpoint rule works so well, see this or this.