AttributeError: 'int' object has no attribute 'lower' in TFIDF and CountVectorizer
The error is AttributeError: 'int' object has no attribute 'lower',
which means an integer cannot be lower-cased. Somewhere in the call chain, .lower() is being called on an integer object, and integers have no such method.
Why does this happen?
The CountVectorizer constructor has a parameter lowercase, which is True by default. When you call .fit_transform(), it tries to lower-case every item in your input, and that fails as soon as it reaches an item that is an integer rather than a string. In other words, your input data contains at least one integer object. For example, your list looks something like this:
corpus = ['sentence1', 'sentence 2', 12930, 'sentence 100']
When you pass the above list to CountVectorizer, it raises this exception.
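To see the failure in isolation, here is a minimal sketch that uses only plain Python. The function lowercase_all is a hypothetical stand-in for the lowercasing step, not scikit-learn's actual code; it only assumes that the step calls .lower() on each item.

```python
corpus = ['sentence1', 'sentence 2', 12930, 'sentence 100']

def lowercase_all(docs):
    # Hypothetical stand-in for the lowercasing step (not scikit-learn's code):
    # it calls .lower() on every item, which only strings support.
    return [doc.lower() for doc in docs]

try:
    lowercase_all(corpus)
    error_message = None
except AttributeError as exc:
    # The integer 12930 has no .lower() method, so we end up here.
    error_message = str(exc)

print(error_message)  # 'int' object has no attribute 'lower'
```

The strings before the integer are processed without complaint; the exception is raised only when the integer is reached.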
How to fix it?
Here are two possible solutions to avoid this problem:
1) Convert every item in your corpus to a string:
corpus = ['sentence1', 'sentence 2', 12930, 'sentence 100']
corpus = [str(item) for item in corpus]
2) Remove the integers from your corpus (note that this drops those items entirely):
corpus = ['sentence1', 'sentence 2', 12930, 'sentence 100']
corpus = [item for item in corpus if not isinstance(item, int)]
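Before choosing between the two fixes, it can help to check which items are actually causing the problem. The following is a rough sketch; find_non_strings is a hypothetical helper name, not part of scikit-learn.

```python
corpus = ['sentence1', 'sentence 2', 12930, 'sentence 100']

def find_non_strings(docs):
    # Hypothetical helper: report (position, type name, value) for every
    # item that is not a str and would therefore fail on .lower().
    return [
        (index, type(doc).__name__, doc)
        for index, doc in enumerate(docs)
        if not isinstance(doc, str)
    ]

offenders = find_non_strings(corpus)
print(offenders)  # [(2, 'int', 12930)]
```

Checking against str rather than int also catches other non-string values such as floats or None, which would fail on .lower() in the same way.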