Extract title with BeautifulSoup
You can use "soup.title" directly instead of "soup.find_all('title', limit=1)" or "soup.find('title')". It returns the first <title> tag, and "title.string" gives you the text inside it.
from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
print(html[:60])  # Preview the first 60 characters of the page
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
title = soup.title
print(title)         # Prints the whole <title> tag
print(title.string)  # Prints only the text inside the tag
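As a quick comparison of the three ways to reach the title, here is a minimal sketch. It parses a small inline HTML string (sample_html is a made-up document, not the BBC page) so it runs without network access.

```python
from bs4 import BeautifulSoup

# Hypothetical inline document, used so the example needs no network access.
sample_html = (
    "<html><head><title>Example page</title></head>"
    "<body><p>Hi</p></body></html>"
)
soup = BeautifulSoup(sample_html, "html.parser")

# Attribute access and find() both give the first matching Tag (or None).
tag_via_attribute = soup.title
tag_via_find = soup.find("title")

# find_all() always gives a list-like ResultSet, even with limit=1.
tags_via_find_all = soup.find_all("title", limit=1)

print(tag_via_attribute)            # <title>Example page</title>
print(tag_via_find.string)          # Example page
print(len(tags_via_find_all))       # 1
print(tags_via_find_all[0].string)  # Example page
```

So with find_all you have to index into the result before asking for .string, while soup.title and find hand you the tag directly.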
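To navigate the soup, you need a BeautifulSoup object, not a string. get_text() returns a plain string, so remove the get_text() call on the soup.
Moreover, you can replace raw.find_all('title', limit=1) with raw.find('title'). The two are nearly equivalent: find_all returns a list containing the single match, while find returns the tag itself (or None if nothing matches).
Try this: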
from urllib import request
url = "http://www.bbc.co.uk/news/election-us-2016-35791008"
html = request.urlopen(url).read().decode('utf8')
print(html[:60])  # Preview the first 60 characters of the page
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')
title = soup.find('title')
print(title) # Prints the tag
print(title.string) # Prints the tag string content
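If you want to see why the get_text() call gets in the way, and guard against pages without a title, something like the following sketch may help. The extract_title helper and the inline HTML strings are my own illustrations, not part of the original code.

```python
from bs4 import BeautifulSoup


def extract_title(html_text):
    """Return the stripped <title> text, or None if there is no usable title.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html_text, "html.parser")
    title_tag = soup.find("title")
    # find() returns None when no tag matches; .string is None for an empty tag.
    if title_tag is None or title_tag.string is None:
        return None
    return title_tag.string.strip()


# get_text() flattens the soup into a plain str, which has no find_all().
text_only = BeautifulSoup("<title>Hi</title>", "html.parser").get_text()
print(type(text_only).__name__)        # str
print(hasattr(text_only, "find_all"))  # False

print(extract_title("<html><head><title>  News  </title></head></html>"))  # News
print(extract_title("<html><body><p>no title here</p></body></html>"))      # None
```

Checking for None before reading .string avoids an AttributeError on pages that have no <title> at all.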