Is there any way to get the vocabulary size from a Doc2Vec model?
If model is your trained Doc2Vec model, then the number of unique word tokens that remain in the vocabulary after applying your min_count cutoff is available from:
len(model.wv.vocab)
The number of trained document tags is available from:
len(model.docvecs)
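One caveat: as far as I know, Gensim 4.x removed wv.vocab (the mapping is now wv.key_to_index) and renamed docvecs to dv, so the two lines above may fail on a newer install. Here is a minimal sketch of version-tolerant helpers. The names vocab_size and doc_tag_count are made up for illustration, and the objects at the bottom are stand-ins, not real Gensim models:

```python
from types import SimpleNamespace

def vocab_size(model):
    """Number of unique words kept in the model's vocabulary."""
    wv = model.wv
    # Gensim 4.x exposes the word -> index mapping as key_to_index.
    key_to_index = getattr(wv, "key_to_index", None)
    if key_to_index is not None:
        return len(key_to_index)
    # Older releases: fall back to the vocab dictionary.
    return len(wv.vocab)

def doc_tag_count(model):
    """Number of trained document tags."""
    # Gensim 4.x names the document vectors dv; older releases use docvecs.
    doc_vectors = getattr(model, "dv", None)
    if doc_vectors is None:
        doc_vectors = model.docvecs
    return len(doc_vectors)

# Stand-in objects (NOT real Gensim models), only to show both attribute layouts.
old_style = SimpleNamespace(
    wv=SimpleNamespace(vocab={"cat": 0, "dog": 1}),
    docvecs=["doc0", "doc1", "doc2"],
)
new_style = SimpleNamespace(
    wv=SimpleNamespace(key_to_index={"cat": 0, "dog": 1}),
    dv=["doc0", "doc1", "doc2"],
)

print(vocab_size(old_style), doc_tag_count(old_style))
print(vocab_size(new_style), doc_tag_count(new_style))
```

With a real model you would call vocab_size(model) and doc_tag_count(model) directly and drop the stand-ins.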
vocab is a dictionary keyed by word, so you can get the words themselves with keys():
model.wv.vocab.keys()
In Python 3 this returns a dictionary view of the words rather than a list; wrap it in list() if you need an actual list.
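If you want a plain Python list of the words regardless of Gensim version, something along these lines might work. This is a sketch that assumes newer releases expose wv.key_to_index in place of wv.vocab; vocab_words is a hypothetical helper name, and the objects below are stand-ins rather than real models:

```python
from types import SimpleNamespace

def vocab_words(model):
    """Return the vocabulary words as a plain list."""
    wv = model.wv
    # Newer Gensim: key_to_index maps each word to its integer index.
    key_to_index = getattr(wv, "key_to_index", None)
    if key_to_index is not None:
        return list(key_to_index)
    # Older Gensim: vocab is a dict keyed by word; keys() is a view, so copy it.
    return list(wv.vocab.keys())

# Stand-in objects (NOT real Gensim models) covering both attribute layouts.
old_model = SimpleNamespace(wv=SimpleNamespace(vocab={"cat": 0, "dog": 1}))
new_model = SimpleNamespace(wv=SimpleNamespace(key_to_index={"cat": 0, "dog": 1}))

print(vocab_words(old_model))
print(vocab_words(new_model))
```

Returning a copy means you can sort or slice the result without touching the model's own mapping.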