object of type 'generator' has no len()
Constructing a string by concatenating values with a separator between them is best done with str.join:
import nltk

def gen_bigrams(text):
    token = nltk.word_tokenize(text)
    bigrams = nltk.ngrams(token, 2)
    # instead of " ".join, "{0[0]} {0[1]}".format would also work in the map
    return "-->".join(map(" ".join, bigrams))
Note that there will be no trailing "-->", so append one if you need it. This way you never have to think about the length of the iterable you're using, which is almost always the case in Python. If you want to iterate through an iterable, use for x in iterable:. If you do need the indexes, use enumerate:
for i, x in enumerate(iterable):
    ...
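To see where the original error comes from, and what to do if a count is genuinely needed, here is a minimal sketch. It uses a hypothetical stand-in generator, pairs(), instead of nltk.ngrams so that it runs without NLTK installed, and the token list is made up for illustration.

```python
def pairs(tokens):
    """Hypothetical stand-in for nltk.ngrams(tokens, 2): yields adjacent pairs lazily."""
    for left, right in zip(tokens, tokens[1:]):
        yield (left, right)

tokens = ["i", "am", "feeling", "sad"]  # illustrative input, not from the question

# A generator has no length, which is the error in the title.
try:
    len(pairs(tokens))
    len_error = None
except TypeError as exc:
    len_error = str(exc)

# If a count is really needed, materialize the generator into a list first.
bigram_list = list(pairs(tokens))
bigram_count = len(bigram_list)

# enumerate supplies the index without ever asking for a length.
numbered = ["%d: %s %s" % (i, a, b) for i, (a, b) in enumerate(pairs(tokens))]
```

Building the list costs memory proportional to the number of bigrams, so it is only worth doing when the count itself matters.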
bigrams is a generator, not a list, so it has no length. Each call to next(bigrams) (bigrams.next() in Python 2) gives you the next tuple of tokens. You can call len() on that tuple, but not on the generator itself. The following is more explicit code that does what you are trying to achieve.
>>> import nltk
>>> review = "i am feeling sad and disappointed due to errors"
>>> token = nltk.word_tokenize(review)
>>> bigrams = nltk.ngrams(token, 2)
>>> output = ""
>>> try:
...     while True:
...         temp = next(bigrams)  # works in Python 2.6+ and Python 3
...         output += "%s %s-->" % (temp[0], temp[1])
... except StopIteration:
...     pass
...
>>> output
'i am-->am feeling-->feeling sad-->sad and-->and disappointed-->disappointed due-->due to-->to errors-->'
>>>
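The same string can be built without catching StopIteration by hand, because a for loop consumes the generator and stops on its own when it is exhausted. A minimal sketch, assuming a hypothetical stand-in generator pairs() in place of nltk.ngrams and a plain str.split() in place of nltk.word_tokenize so that it runs without NLTK:

```python
def pairs(tokens):
    """Hypothetical stand-in for nltk.ngrams(tokens, 2): yields adjacent pairs lazily."""
    for left, right in zip(tokens, tokens[1:]):
        yield (left, right)

review = "i am feeling sad and disappointed due to errors"
tokens = review.split()  # assumption: whitespace split is enough for this sentence

output = ""
# The for loop handles StopIteration internally, so no try/except is needed.
for first, second in pairs(tokens):
    output += "%s %s-->" % (first, second)
```

This keeps the trailing "-->" that the loop above produces; the str.join approach is the one to use when no trailing separator is wanted.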