Detecting whether or not text is English (in bulk)
I read about a method for detecting English using trigrams (three-letter sequences).
You go over the text and find the most frequently used trigrams in its words. If they match the most frequent trigrams in English words, the text is probably written in English.
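A rough Python sketch of that idea, in case it helps. The reference trigram set here is a small hand-picked list I am assuming for illustration, not a measured profile, and the function names and the 0.3 threshold are likewise arbitrary; a real detector would build its profile from a large English corpus.

```python
import re
from collections import Counter

# Hypothetical, hand-picked reference list of frequent English trigrams;
# a real profile would be counted from a large English corpus.
COMMON_ENGLISH_TRIGRAMS = {
    "the", "and", "ing", "ent", "ion", "her", "for", "tha", "ere", "tio",
    "ter", "est", "ers", "ati", "hat", "ate", "all", "ver", "his", "ith",
    "thi", "wit", "are", "not", "you",
}


def top_trigrams(text, n=20):
    """Return the n most frequent within-word character trigrams in text."""
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        for i in range(len(word) - 2):
            counts[word[i:i + 3]] += 1
    return [trigram for trigram, _ in counts.most_common(n)]


def looks_english(text, n=20, threshold=0.3):
    """Guess whether text is English from the share of its top trigrams
    that appear in the reference list. The threshold is an assumption."""
    top = top_trigrams(text, n)
    if not top:
        return False
    hits = sum(1 for trigram in top if trigram in COMMON_ENGLISH_TRIGRAMS)
    return hits / len(top) >= threshold
```

For bulk processing you could call looks_english on each document and keep the ones that return True. Very short texts give too few trigrams to be reliable, so the threshold would need tuning against real data.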
Take a look at this Ruby project:
https://github.com/feedbackmine/language_detector
EDIT: This won't work in this case, since the OP is processing text in bulk, which is against Google's TOS.
Use the Google Translate language detect API. Python example from the docs:
import urllib2
import simplejson

# Replace the key and user IP placeholders with your own values.
url = ('https://ajax.googleapis.com/ajax/services/language/detect?' +
       'v=1.0&q=Hola,%20mi%20amigo!&key=INSERT-YOUR-KEY&userip=INSERT-USER-IP')
# The Referer header should be the URL of your site.
request = urllib2.Request(url, None, {'Referer': 'INSERT-YOUR-SITE-URL'})
response = urllib2.urlopen(request)
results = simplejson.load(response)
if results['responseData']['language'] == 'en':
    print 'English detected'