Tanimoto coefficient distance measure
The Tanimoto similarity coefficient (which is not a true distance measure) is defined by
d(x,y) = x.y / ((|x|*|x|) + (|y|*|y|)- x.y)
for bit vectors x and y.
Now compare that with the cosine similarity coefficient,
d(x,y) = x.y / (|x| * |y|)
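The numerators are identical; only the denominators differ: |x|*|x| + |y|*|y| - x.y versus |x| * |y|. If x.y is zero, both numerators are zero, so the Tanimoto and cosine similarity coefficients are the same (both equal 0, provided x and y are nonzero).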
Geometrically, x.y is zero if and only if x and y are perpendicular.
Since x and y are bit vectors (i.e. their values in each dimension can only be 0 or 1), x.y equalling zero means
x1*y1 + x2*y2 + ... + xn*yn = 0
Every term xi*yi is either 0 or 1, so if any xi*yi = 1*1 = 1, the whole sum would be positive. For the whole sum to be zero, no term xi*yi can equal 1. They must all equal 0:
x1*y1 = 0
x2*y2 = 0
...
xn*yn = 0
In other words, if xi is 1, then yi must be 0, and if yi is 1, then xi must be 0. The two vectors never have a 1 in the same position (both may be 0 there).
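If you want to convince yourself of this equivalence, here is a small brute-force sketch in Python. It checks every pair of 4-bit vectors and confirms that the dot product is zero exactly when the two vectors have no 1 in a common position. The helper names (dot, share_a_one) and the choice of n = 4 are just assumptions for illustration.

```python
from itertools import product

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def share_a_one(x, y):
    """True if some position holds a 1 in both x and y."""
    return any(xi == 1 and yi == 1 for xi, yi in zip(x, y))

n = 4  # vector length, chosen small enough to enumerate every pair
vectors = list(product((0, 1), repeat=n))

# For bit vectors: x.y == 0 exactly when no position has a 1 in both.
all_agree = all(
    (dot(x, y) == 0) == (not share_a_one(x, y))
    for x in vectors
    for y in vectors
)
print(all_agree)  # True
```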
So there are many examples where the Tanimoto similarity is equal to the cosine similarity. Take any two nonzero bit vectors with no 1 in the same position:
x = (0,1,0,1)
y = (1,0,0,0)
for instance.
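To check the example numerically, here is a minimal Python sketch of the two formulas as written above. The function names (tanimoto, cosine) are my own, and the code assumes both vectors are nonzero, since otherwise the denominators are zero. The second pair (u, v) is an extra example I added, with overlapping 1s, to show that the two coefficients generally differ when x.y is not zero.

```python
import math

def dot(x, y):
    """Dot product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def tanimoto(x, y):
    """x.y / (|x|*|x| + |y|*|y| - x.y); assumes x and y are not both zero."""
    xy = dot(x, y)
    return xy / (dot(x, x) + dot(y, y) - xy)

def cosine(x, y):
    """x.y / (|x| * |y|); assumes neither x nor y is the zero vector."""
    return dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))

# The example from above: no 1 in a common position, so x.y = 0.
x = (0, 1, 0, 1)
y = (1, 0, 0, 0)
print(tanimoto(x, y), cosine(x, y))  # 0.0 0.0

# An added pair that shares a 1 in the first position: the values differ.
u = (1, 1, 0, 0)
v = (1, 0, 0, 0)
print(tanimoto(u, v), cosine(u, v))
```

For the disjoint pair both coefficients are 0 because the shared numerator x.y is 0. For the overlapping pair the Tanimoto value is 1 / (2 + 1 - 1) = 0.5, while the cosine value is 1 / sqrt(2).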