Measure similarity between two documents using Doc2Vec
Hello, just in case someone is interested: to do this you just need the cosine distance between the two document vectors.
I found that most people use scipy's 'spatial' module for this purpose.
Here is a small code snippet that should work pretty well if you already have a trained doc2vec model:
from gensim.models import doc2vec
from scipy import spatial
# model_file is the path to your previously saved Doc2Vec model
d2v_model = doc2vec.Doc2Vec.load(model_file)
first_text = '..'
second_text = '..'
# infer_vector expects a list of tokens; ideally tokenize the same way as the training data
vec1 = d2v_model.infer_vector(first_text.split())
vec2 = d2v_model.infer_vector(second_text.split())
cos_distance = spatial.distance.cosine(vec1, vec2)
# cos_distance indicates how much the two texts differ from each other:
# it is 1 - cosine similarity, so 0 means the vectors point the same way and
# higher values (up to 2) mean more distant (i.e. different) texts
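
If you would rather have a similarity score (higher means more alike), it is just 1 minus the cosine distance. Here is a minimal sketch using only NumPy. The helper name `cosine_similarity` is my own, and the toy vectors are made up to stand in for what infer_vector returns, so treat it as an illustration and not as part of gensim or scipy:

```python
import numpy as np

def cosine_similarity(vec1, vec2):
    """Return the cosine similarity in [-1, 1] (equal to 1 - cosine distance)."""
    v1 = np.asarray(vec1, dtype=float)
    v2 = np.asarray(vec2, dtype=float)
    norm_product = np.linalg.norm(v1) * np.linalg.norm(v2)
    if norm_product == 0.0:
        # The angle between vectors is undefined if one of them has zero length
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(v1, v2) / norm_product)

# Toy vectors standing in for d2v_model.infer_vector(...) output (hypothetical values)
a = [1.0, 0.0]
b = [0.0, 1.0]
same_direction = cosine_similarity(a, [2.0, 0.0])  # parallel vectors
orthogonal = cosine_similarity(a, b)               # perpendicular vectors
```

Keep in mind that infer_vector starts from a random initialization, so two calls on the same text usually give slightly different vectors, and the distance between them will be small but not exactly 0.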