urllib2 HTTP Error 400: Bad Request
You need to use urllib.quote() on your 'query' variable:
query = urllib.quote(query)
host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (query, page)
This does the necessary URL escaping, converting the space in "big dog" to "big%20dog".
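As a quick check of what the escaping produces, here is a minimal sketch. The sample string "big dog" is only an illustration, and the try/except import is an assumption added so the snippet also runs on Python 3, where quote lives in urllib.parse rather than urllib.

```python
try:
    from urllib.parse import quote, quote_plus  # Python 3 location
except ImportError:
    from urllib import quote, quote_plus  # Python 2 location

query = "big dog"  # illustrative sample query, not from the original code

# quote() percent-encodes the space
escaped = quote(query)

# quote_plus() encodes the space as '+', the form usually seen in query strings
escaped_plus = quote_plus(query)

print(escaped)       # big%20dog
print(escaped_plus)  # big+dog
```

Either form should be acceptable in the q= parameter of a search URL; the raw, unescaped space is what triggers the 400.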
The reason "the dog" returns a 400 error is that you aren't escaping the string for use in a URL: the space in it is not a valid URL character.
If you do this:
import urllib, urllib2
quoted_query = urllib.quote(query)
host = 'http://www.bing.com/search?q=%s&go=&qs=n&sk=&sc=8-13&first=%s' % (quoted_query, page)
req = urllib2.Request(host)
req.add_header('User-Agent', User_Agent)
response = urllib2.urlopen(req)
It will work.
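If you would rather not escape each value by hand, urlencode can build the whole query string in one step. This is only a sketch: build_search_url is a hypothetical helper name, the extra Bing parameters (go, qs, sk, sc) are left out for brevity, and the try/except import is there so it also runs on Python 3.

```python
try:
    from urllib.parse import urlencode  # Python 3 location
except ImportError:
    from urllib import urlencode  # Python 2 location


def build_search_url(query, page):
    """Hypothetical helper: return a search URL with every parameter escaped."""
    # A list of pairs keeps the parameter order predictable.
    params = [("q", query), ("first", page)]
    return "http://www.bing.com/search?" + urlencode(params)


url = build_search_url("the dog", 1)
print(url)  # http://www.bing.com/search?q=the+dog&first=1
```

The resulting string can be passed to urllib2.Request() in place of the hand-formatted host.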
However, I highly suggest you use requests instead of urllib/urllib2/httplib. It's much easier, and it handles the URL escaping for you.
This is the same code with the Python requests library:
import requests
results = requests.get("http://www.bing.com/search",
                       params={'q': query, 'first': page},
                       headers={'User-Agent': user_agent})