spaCy: Can't find model 'en_core_web_sm' on Windows 10 and Python 3.5.3 :: Anaconda custom (64-bit)
Initially I downloaded two English model packages by running the following commands in the Anaconda Prompt:
python -m spacy download en_core_web_lg
python -m spacy download en_core_web_sm
But I kept getting a linkage error. Finally, running the command below established the link and solved the error:
python -m spacy download en
The answer to your misunderstanding lies in a Unix concept: soft links (symlinks), which are roughly what shortcuts are in Windows. Let me explain.
When you run spacy download en, spaCy tries to find the best small model that matches your spaCy distribution. That small model defaults to en_core_web_sm, which comes in different variations corresponding to the different spaCy versions (for example, spacy and spacy-nightly have en_core_web_sm models of different sizes).
When spaCy finds the best model for you, it downloads it and then links the name en to the package it downloaded, e.g. en_core_web_sm. That basically means that whenever you refer to en you are really referring to en_core_web_sm. In other words, en after linking is not a "real" package; it is just a name for en_core_web_sm.
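To make the soft link idea concrete, here is a toy sketch in plain Python. It is not spaCy's actual linking code: the function name demo_soft_link, the temporary directory, and the contents of meta.json are all made up for illustration. It only shows that a link named en resolves to whatever directory it points at.

```python
import json
import os
import tempfile


def demo_soft_link(base_dir):
    """Create a stand-in model directory and a soft link named 'en' pointing to it."""
    # Hypothetical stand-in for an installed model package directory.
    target = os.path.join(base_dir, "en_core_web_sm")
    os.makedirs(target)
    with open(os.path.join(target, "meta.json"), "w") as fh:
        json.dump({"description": "stand-in model"}, fh)

    link = os.path.join(base_dir, "en")
    # On Windows this call may need elevated privileges to succeed.
    os.symlink(target, link, target_is_directory=True)
    return target, link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target, link = demo_soft_link(tmp)
        # The link is only a name: reading through it reads the target's files.
        with open(os.path.join(link, "meta.json")) as fh:
            print(json.load(fh))
        print(os.path.realpath(link) == os.path.realpath(target))
```

Deleting the link would leave the target directory untouched, which is the sense in which en is "not a real package".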
However, the link only goes one way: en points to en_core_web_sm, and en_core_web_sm knows nothing about en. When you ran spacy download en, spaCy basically did a pip install of the model package it picked (en_core_web_sm here) and then created the link en for it. So pip knows that you have a package named en_core_web_sm installed for your Python distribution, but knows nothing about a package named en. The name en simply stands in for en_core_web_sm when you load it, which means that en is just a soft link to en_core_web_sm.
Of course, you can download en_core_web_sm directly with the command python -m spacy download en_core_web_sm, or you can even link the name en to other models. For example, you could run python -m spacy download en_core_web_lg and then python -m spacy link en_core_web_lg en. That would make en a name for en_core_web_lg, which is a large spaCy model for the English language.
Hope it is clear now :)
The following worked for me:
import en_core_web_sm
nlp = en_core_web_sm.load()
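
If you are not sure which model package actually got installed, a small sketch like the one below may help. It uses only the standard library to check which package names are importable and then loads the first one the same way as above. The helper name first_installed and the candidate list are my own assumptions, not part of spaCy.

```python
import importlib
import importlib.util


def first_installed(candidates):
    """Return the first top-level package name in candidates that is importable, or None."""
    for name in candidates:
        # find_spec returns None when no top-level module of that name is installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    # Candidate names taken from the question; adjust to whatever you downloaded.
    name = first_installed(["en_core_web_sm", "en_core_web_lg"])
    if name is None:
        print("No model package found; try: python -m spacy download en_core_web_sm")
    else:
        # Same call as above, just with the module looked up by name.
        nlp = importlib.import_module(name).load()
        print("Loaded {}".format(name))
```

Because this imports the package directly, it does not depend on the en link having been created.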