Is there any way to get the vocabulary size from a Doc2Vec model?

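If model is your trained Doc2Vec model, the number of unique word tokens left in the vocabulary after min_count is applied is:

len(model.wv.vocab)

This works in gensim 3.x. In gensim 4.0 and later, wv.vocab was removed; use len(model.wv.key_to_index) or simply len(model.wv) instead.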

The number of trained document tags is:

len(model.docvecs)

In gensim 4.0 and later, docvecs was renamed to dv, so the equivalent is len(model.dv).

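In gensim 3.x, vocab is a dictionary keyed by word, so you can get the words themselves with keys():

model.wv.vocab.keys()

In Python 3 this returns a dictionary view rather than a list; wrap it in list(...) if you need an actual list of words. In gensim 4.0 and later, use list(model.wv.key_to_index) or model.wv.index_to_key.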
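If you are not sure which gensim version the code will run under, the lookups above can be wrapped in small helpers. The following is only a sketch: the function names vocab_size, doc_tag_count and vocab_words are made up for illustration and are not part of gensim. It assumes the model exposes either the 3.x attributes (wv.vocab, docvecs) or the 4.x ones (wv.key_to_index, dv).

```python
def vocab_size(model):
    """Number of unique words kept after min_count was applied."""
    wv = model.wv
    # gensim 4.x stores the word -> index mapping in key_to_index.
    if hasattr(wv, "key_to_index"):
        return len(wv.key_to_index)
    # gensim 3.x fallback: vocab is a dict keyed by word.
    return len(wv.vocab)


def doc_tag_count(model):
    """Number of trained document tags."""
    # gensim 4.x name is dv; gensim 3.x name is docvecs.
    dv = getattr(model, "dv", None)
    if dv is None:
        dv = model.docvecs
    return len(dv)


def vocab_words(model):
    """The vocabulary as a plain Python list of words."""
    wv = model.wv
    if hasattr(wv, "key_to_index"):
        return list(wv.key_to_index)
    return list(wv.vocab.keys())
```

With a trained model you would call vocab_size(model), doc_tag_count(model) and vocab_words(model). The 4.x attribute is checked first because gensim 4 raises an error when the removed wv.vocab is accessed.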

Tags:

Gensim

Word2Vec

Doc2Vec
