Beautiful Soup [Python] and the extracting of text in a table

Use the "text" attribute to get the text inside each "td" element.

1) First locate the table in the DOM by tag or ID

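from bs4 import BeautifulSoup
# self.driver is assumed to be a Selenium WebDriver; page_source is the rendered HTML
soup = BeautifulSoup(self.driver.page_source, "html.parser")
htnm_migration_table = soup.find("table", {'id': 'htnm_migration_table'})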

2) Read tbody

tbody = htnm_migration_table.find('tbody')

3) Read all tr from tbody tag

trs = tbody.find_all('tr')

4) Get all td elements from each tr

for tr in trs:
    tds = tr.find_all('td')
    for td in tds:
        print(td.text)
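
The four steps above can be tried without a browser. The following is a minimal sketch that assumes a small hand-written HTML string in place of self.driver.page_source; the table contents and the helper name table_cell_texts are made up for illustration and are not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for self.driver.page_source.
SAMPLE_HTML = """
<table id="htnm_migration_table">
  <thead><tr><th>Name</th><th>Status</th></tr></thead>
  <tbody>
    <tr><td>alpha</td><td> done </td></tr>
    <tr><td>beta</td><td>pending</td></tr>
  </tbody>
</table>
"""


def table_cell_texts(html, table_id):
    """Return the text of every td in the table body, one list per row."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"id": table_id})
    if table is None:
        # find returns None when nothing matches, so guard before going deeper.
        return []
    # Fall back to the table itself when the markup has no explicit tbody.
    body = table.find("tbody") or table
    rows = []
    for tr in body.find_all("tr"):
        # get_text(strip=True) trims surrounding whitespace from each cell.
        rows.append([td.get_text(strip=True) for td in tr.find_all("td")])
    return rows


rows = table_cell_texts(SAMPLE_HTML, "htnm_migration_table")
for row in rows:
    print(row)
```

Returning a list per row keeps the row structure, which tends to be easier to work with than printing each cell as it is visited.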

First find the table (as you are doing). find returns only the first match, whereas find_all returns a list of every match, in which case we would have to add an extra [0] to take the first element of the list:

table = soup.find('table', attrs={'class': 'bp_ergebnis_tab_info'})

Then use find again to find the first td:

first_td = table.find('td')

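Then use get_text() to extract the textual contents (renderContents(), the older Beautiful Soup 3 method, is deprecated in Beautiful Soup 4 and returns the cell's markup as bytes rather than a string):

text = first_td.get_text()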

... and the job is done, though you may also want to use strip() to remove leading and trailing whitespace:

trimmed_text = text.strip()

This should give:

>>> print(trimmed_text)
This is a sample text
>>>

as desired.
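
As a self-contained check of these steps, the sketch below runs them against a hand-written snippet. The HTML is an assumption made up for illustration; only the class name bp_ergebnis_tab_info and the sample text come from the answer above. It reads the cell with get_text() from Beautiful Soup 4, shows find_all(...)[0] as the longer equivalent of find, and uses get_text(strip=True) as a way to trim in the same call.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class name and sample text come from the answer.
SAMPLE_HTML = """
<table class="bp_ergebnis_tab_info">
  <tr>
    <td>
      This is a sample text
    </td>
    <td>Another cell</td>
  </tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find returns the first match (or None); find_all returns a list of matches.
table = soup.find("table", attrs={"class": "bp_ergebnis_tab_info"})
same_table = soup.find_all("table", attrs={"class": "bp_ergebnis_tab_info"})[0]

# The first td of the table, then its text with surrounding whitespace removed.
first_td = table.find("td")
trimmed_text = first_td.get_text().strip()

# get_text(strip=True) does the trimming as part of the extraction.
also_trimmed = first_td.get_text(strip=True)

print(trimmed_text)
```

Because find returns None when nothing matches, real pages may need a check on table and first_td before calling methods on them.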
