How to get word2index from gensim
The word-to-index mappings are in the KeyedVectors vocab property, a dictionary whose values are objects that include an index property.
For example:
word = "whatever" # for any word in model
i = model.vocab[word].index
model.index2word[i] == word # will be true
An even simpler solution is to enumerate index2word:
word2index = {token: token_index for token_index, token in enumerate(w2v.index2word)}
word2index['hi'] == 30308 # True for this particular model; the index depends on the vocabulary
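Note that gensim 4.0 renamed these attributes: index2word became index_to_key, and the vocab dictionary was replaced by key_to_index. A minimal sketch of a helper that works with either naming is below. The function name build_word2index and the SimpleNamespace stand-ins are hypothetical, used only so the snippet runs without a trained model; with real data you would pass your KeyedVectors object instead.

```python
from types import SimpleNamespace


def build_word2index(kv):
    """Return a {word: index} dict from a KeyedVectors-like object."""
    # gensim >= 4.0 exposes index_to_key; older releases expose index2word.
    index_to_word = getattr(kv, "index_to_key", None)
    if index_to_word is None:
        index_to_word = kv.index2word
    return {word: i for i, word in enumerate(index_to_word)}


# Stand-ins for real KeyedVectors objects (hypothetical three-word vocabulary).
old_style = SimpleNamespace(index2word=["the", "of", "hi"])
new_style = SimpleNamespace(index_to_key=["the", "of", "hi"])

print(build_word2index(old_style)["hi"])  # 2
print(build_word2index(new_style)["hi"])  # 2
```

On gensim 4.0 and later you may not need the helper at all, since key_to_index is already a plain word-to-index dictionary.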