BeautifulSoup: get the inner text of a paragraph tag based on id (code example)
Example 1: find elements in BeautifulSoup by partial attribute value
# Use a regex.
# Suppose the page contains several 'div' elements with 'class' values like these:
# <div class='abcd bcde cdef efgh'>some content</div>
# <div class='mnop bcde cdef efgh'>some content</div>
# <div class='abcd pqrs cdef efgh'>some content</div>
# <div class='hijk wxyz cdef efgh'>some content</div>
# The class string of every div above ends with 'cdef efgh'.
# To collect all of them in a single list:
from bs4 import BeautifulSoup
import re # library for regex in python
# html_response is a placeholder for the HTML you want to parse;
# 'html.parser' can be swapped for any other parser you prefer.
soup = BeautifulSoup(html_response, 'html.parser')
elements = soup.find_all('div', {'class': re.compile(r'cdef efgh$')}) # $ means 'cdef efgh' must be at the end of the string
# Note: this is just one case. A regex can express almost any matching rule.
# Learn more and experiment with regex at https://regex101.com/
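The regex approach from Example 1 can be tried end to end on a small inline document. This is a minimal sketch, assuming the sample markup below (the four divs from the comments, plus a hypothetical fifth div added only to show a non-match) and the `html.parser` parser that ships with Python:

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup: the four divs from the comments above,
# plus one div whose class string does NOT end with 'cdef efgh'.
html = """
<div class='abcd bcde cdef efgh'>first</div>
<div class='mnop bcde cdef efgh'>second</div>
<div class='abcd pqrs cdef efgh'>third</div>
<div class='hijk wxyz cdef efgh'>fourth</div>
<div class='cdef efgh zzzz'>no match</div>
"""

soup = BeautifulSoup(html, "html.parser")

# The pattern contains a space, so it cannot match a single class token;
# it only matches the full class string of each div.
pattern = re.compile(r"cdef efgh$")
elements = soup.find_all("div", {"class": pattern})

matched_text = [div.get_text() for div in elements]
print(matched_text)
```

The fifth div is skipped because its class string ends with `zzzz`, so the `$` anchor rules it out.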
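Example 2: scrape text from a specific p tag
from bs4 import BeautifulSoup
from urllib.request import urlopen  # Python 3 location of urlopen
# Download the page and parse it (the 'lxml' parser requires the lxml package)
with urlopen('http://meinparlament.diepresse.com/') as response:
    content = response.read()
soup = BeautifulSoup(content, 'lxml')
# Find every <div class="content-question"> and print the text of its first <p>
questions = soup.find_all('div', attrs={"class": "content-question"})
for div in questions:
    p = div.find('p')
    if p is not None:  # skip divs that contain no <p>
        print(p.text)
# Another way to retrieve these divs (matches the exact class attribute value):
# questions = soup.select('div[class="content-question"]')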
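Example 3: get the text of a p tag by its id
The examples above select by class, so here is a minimal sketch of the id-based lookup. The markup, the ids `question-1` and `question-2`, and the helper name `paragraph_text_by_id` are all hypothetical and chosen only for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the ids are invented for this sketch.
html = """
<div class="content-question">
  <p id="question-1">What is the budget?</p>
  <p id="question-2">Who is responsible?</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def paragraph_text_by_id(soup, element_id):
    """Return the stripped text of <p id=element_id>, or None if absent."""
    tag = soup.find("p", id=element_id)
    return tag.get_text(strip=True) if tag is not None else None


first = paragraph_text_by_id(soup, "question-1")
missing = paragraph_text_by_id(soup, "no-such-id")

# Equivalent lookup with a CSS selector (tag name plus #id).
via_css = soup.select_one("p#question-2").get_text(strip=True)

print(first)
print(missing)
print(via_css)
```

Checking for `None` before reading the text avoids an `AttributeError` when no paragraph has the requested id.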