Fastest way to calculate matrix of pairwise distances
Using Outer
is here one of the worst methods, and not just because it computes the distance twice, but because you can't leverage vectorization in this approach. This is actually a common issue and an important point to stress: Outer
works pairwise and is unable to utilize the possible vectorized nature of the operation it is performing on an element-by-element basis.
Here is the code I will adopt from this answer:
distances=
With[{tr=Transpose[pts]},
Function[point,Sqrt[Total[(point-tr)^2]]]/@pts
];//AbsoluteTiming
(* {0.046875,Null} *)
which is an order of magnitude faster. You can Compile
it with a C target which may improve the performance further. Also, essentially the same approach I used in this recent answer, with good performance.
For Manhattan distance, use
distances =
With[{tr = Transpose[pts]},
Function[point, Total@Abs[(point - tr)]] /@ pts];
EDIT
As noted by Ray Koopman in comments, the function DistanceMatrix
from the package HierarchicalClustering`
may be faster for Euclidean distance, for small and medium data size (up to a couple of thousands):
Sqrt[HierarchicalClustering`DistanceMatrix[pts, DistanceFunction -> EuclideanDistance]];// AbsoluteTiming
(* {0.019351, Null} *)
Note, however, that this is only true for the particular case of Euclidean distance, or perhaps other distances which don't require to set the In recent versions of Mathematica it is optimized for several possible DistanceFunction
option explicitly on the top-level. In other cases (for example, for Manhattan distance), it will be quite slow, because when DistanceFunction
is set explicitly, one can not leverage vectorization any more, once again.DistanceFunction
settings, including ManhattanDistance
.
Here is a little procedural implementation using Bag
, compiled to C
:
distmatrix = Compile[{{pts, _Real, 2}},
Block[{x, y, list = Internal`Bag[Most[{0.}]]},
For[x = 1, x <= Length[pts], x++,
For[y = x + 1, y <= Length[pts], y++,
Internal`StuffBag[list,
Abs[Compile`GetElement[pts, x, 1] -
Compile`GetElement[pts, y, 1]] +
Abs[Compile`GetElement[pts, x, 2] -
Compile`GetElement[pts, y, 2]]];
];
];
Internal`BagPart[list, All]
], CompilationTarget -> "C", RuntimeOptions -> "Speed"];
distmatrix[pts]; // AbsoluteTiming
(*
{0.009000, Null}
*)
edit: an even better performing solution is based on Mr. Wizard's vectorization approach and relying on the listability and parallelizability of compiled functions, and as a nice touch, it doesn't rely on undocumented functions.
distmatrix2 =
Compile[{{point, _Real, 1}, {tr, _Real, 2}},
Total @ Abs[point - tr], CompilationTarget -> "C",
RuntimeOptions -> "Speed", RuntimeAttributes -> {Listable},
Parallelization -> True];
For comparison against Leonid's method, let's use more points.
pts = RandomVariate[NormalDistribution[], {10000, 2}];
distmatrix[pts]; // AbsoluteTiming
distances =
With[{tr = Transpose[pts]},
Function[point, Total@Abs[(point - tr)]] /@ pts]; // AbsoluteTiming
distmatrix2[pts, Transpose[pts]]; // AbsoluteTiming
(* {0.865050, Null}, {1.562089, Null}, {0.319018, Null} *)
It seems that the simple procedural implementation is a bit less than twice as fast, and not really worth the extra work/complexity. The listable/parallelized compiled solution is simpler and about 5x faster.
The DistanceMatrix
function, newly introduced in version 10.3, is very fast for Euclidean distances.
Here's a speed comparison with Leonid's fast solution.
pts = RandomReal[1, {5000, 2}];
Euclidean
dm1 = With[{tr = Transpose[pts]}, Function[point, Sqrt[Total[(point - tr)^2]]] /@ pts]; // AbsoluteTiming
(* {0.952627, Null} *)
dm2 = DistanceMatrix[pts, pts]; // AbsoluteTiming
(* {0.212496, Null} *)
dm3 = HierarchicalClustering`DistanceMatrix[pts, DistanceFunction -> EuclideanDistance]; // AbsoluteTiming
(* {0.582991, Null} *)
dm1 == dm2 == dm3
(* True *)
Note that HierarchicalClustering`DistanceMatrix
is a built-in function, not one provided by the package. Most performance-critical functions of this "package" are in reality highly optimized built-ins. Also note that the default distance for this function is not EuclideanDistance
, but that square of that. So we needed to specify EuclideanDistance
explicitly.
Manhattan
Let's test if these functions are special-cased for the Manhattan distance.
dm1 =
With[{tr = Transpose[pts]},
Function[point, Total@Abs[(point - tr)]] /@ pts]; // AbsoluteTiming
(* {0.801177, Null} *)
dm2 =
DistanceMatrix[pts, pts,
DistanceFunction -> ManhattanDistance]; // AbsoluteTiming
(* {0.211771, Null} *)
dm3 =
HierarchicalClustering`DistanceMatrix[pts,
DistanceFunction -> ManhattanDistance]; // AbsoluteTiming
(* {0.621123, Null} *)
dm1 == dm2 == dm3
(* True *)
A final note
For accurate benchmarking is it very important to use AbsoluteTiming
and not Timing
here. In recent versions of Mathematica all of these operations are internally parallelized and Timing
would measure the total CPU time spent by each core, added up, instead of the wall time.
Just for fun, here's a C++ version using LTemplate. This is specialized for 2D points!
<< LTemplate`
I'm on a Mac where the system compiler doesn't support OpenMP. I'll use gcc from MacPorts to be able to use OpenMP.
$CCompiler = {"Compiler" -> CCompilerDriver`GenericCCompiler`GenericCCompiler,
"CompilerInstallation" -> "/opt/local/bin",
"CompilerName" -> "g++-mp-5",
"SystemCompileOptions" ->
"-O3 -m64 -fPIC -framework Foundation -framework mathlink"};
SetDirectory[$TemporaryDirectory];
code = "
#include <cmath>
inline double sqr(double x) { return x*x; }
struct DistMatrix {
mma::RealTensorRef distMat(mma::RealMatrixRef a, mma::RealMatrixRef b) {
mma::RealMatrixRef mat = mma::makeMatrix<double>(a.rows(), b.rows());
#pragma omp parallel for
for (mint i=0; i < a.rows(); ++i)
for (mint j=0; j < b.rows(); ++j)
mat(i, j) = std::hypot(a(i,0) - b(j,0), a(i,1) - b(j,1));
return mat;
}
};
";
Export["DistMatrix.h", code, "String"];
tem = LClass[
"DistMatrix",
{LFun["distMat", {{Real, 2, "Constant"}, {Real, 2, "Constant"}}, {Real, 2}]}
];
CompileTemplate[tem,
"CompileOptions" -> {"-std=c++14", "-fopenmp"}]
LoadTemplate[tem]
obj = Make["DistMatrix"];
dm4 = obj@"distMat"[pts, pts]; // AbsoluteTiming
(* {0.062397, Null} *)
dm1 == dm4
(* True *)