How to extract the meta description from URLs using Python?
BeautifulSoup is a good fit for this.
For the question above, the following code extracts the content of the "description" meta tag:
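import requests
from bs4 import BeautifulSoup
url = 'http://www.virginaustralia.com/au/en/bookings/flights/make-a-booking/'
response = requests.get(url)
# Name the parser explicitly; without it bs4 guesses one and emits a warning.
soup = BeautifulSoup(response.text, 'html.parser')
metas = soup.find_all('meta')
# Collect the content of every <meta name="description"> tag (print() is a function in Python 3).
print([ meta.attrs['content'] for meta in metas if 'name' in meta.attrs and meta.attrs['name'] == 'description' ])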
Output:
['Search for and book Virgin Australia and partner flights to Australian and international destinations.']
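The list comprehension above raises a KeyError if a matching tag has no content attribute, and it misses pages that write the name as "Description". A minimal sketch of a more forgiving helper, assuming you already have the page source as a string; the function name extract_meta_description and the sample HTML are made up for illustration:

```python
from bs4 import BeautifulSoup


def extract_meta_description(html):
    """Return the content of <meta name="description">, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    # Match the name attribute case-insensitively; bs4 passes None to the
    # filter for tags that have no name attribute at all.
    tag = soup.find(
        "meta",
        attrs={"name": lambda value: value is not None and value.lower() == "description"},
    )
    if tag is None:
        return None
    content = tag.get("content")
    return content.strip() if content else None


# Hypothetical sample page, used so the example runs without a network call.
sample_html = """
<html><head>
<title>Example</title>
<meta name="Description" content="  A short summary of the page.  ">
</head><body></body></html>
"""

print(extract_meta_description(sample_html))
print(extract_meta_description("<html><head></head></html>"))
```

With a live page you would pass response.text from the requests call instead of sample_html.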
Do you know XPath? Using the lxml library with XPath is a fast way to extract HTML elements.
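import lxml.html
# html_content is the page source as a string, for example the response.text of a requests call
doc = lxml.html.document_fromstring(html_content)
title_element = doc.xpath("//title")
website_title = title_element[0].text_content().strip()
# The description is stored in the content attribute of <meta name="description">, not in the element text
meta_description_element = doc.xpath("//meta[@name='description']")
website_meta_description = meta_description_element[0].get('content').strip()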
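Indexing with [0] fails with an IndexError when a page has no title or no description tag. Here is a rough sketch of the same XPath approach wrapped in a function that tolerates missing tags. It also falls back to the Open Graph tag, which uses property="og:description" rather than name. The function name describe_page and the sample markup are assumptions for illustration only:

```python
import lxml.html


def describe_page(html_content):
    """Return (title, description); description is None if no tag is found."""
    doc = lxml.html.document_fromstring(html_content)
    # XPath string() yields "" when no <title> exists, so no index check is needed.
    title = doc.xpath("string(//title)").strip()
    # Try the standard meta tag first, then the Open Graph variant.
    for query in (
        "//meta[@name='description']/@content",
        "//meta[@property='og:description']/@content",
    ):
        values = doc.xpath(query)
        if values:
            return title, values[0].strip()
    return title, None


# Hypothetical page that only carries the Open Graph description.
sample_html = (
    "<html><head><title> Demo page </title>"
    '<meta property="og:description" content="Shared summary.">'
    "</head><body></body></html>"
)

print(describe_page(sample_html))
```

Selecting @content directly in the XPath expression returns the attribute values as strings, which avoids a separate .get('content') call.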