How do you get the text from an HTML 'datacell' using BeautifulSoup?
The BeautifulSoup documentation should cover everything you need - in this case it looks like you want to use findNext:
headerRows[0][10].findNext('b').string
A more generic solution which doesn't rely on the <b> tag would be to use the text argument to findAll, which allows you to search only for NavigableString objects:
>>> soup = BeautifulSoup(u'<p>Test 1 <span>More</span> Test 2</p>')
>>> u''.join(soup.findAll(text=True))
u'Test 1 More Test 2'
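If it helps to see what "collect every text node" amounts to, here is a minimal sketch of the same idea using only Python 3's standard library html.parser instead of BeautifulSoup. It is not how BeautifulSoup works internally; the names CellTextCollector and cell_texts and the sample row are made up for illustration, and it assumes the cells are not nested inside one another.

```python
from html.parser import HTMLParser


class CellTextCollector(HTMLParser):
    """Hypothetical helper: gather the text nodes inside each <td>/<th>."""

    def __init__(self):
        super().__init__()
        self.cells = []      # finished cell texts, in document order
        self._parts = None   # text nodes of the cell being read, else None

    def handle_starttag(self, tag, attrs):
        if tag in ("td", "th"):
            self._parts = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._parts is not None:
            self.cells.append("".join(self._parts).strip())
            self._parts = None

    def handle_data(self, data):
        # Called for every text node, however deeply it is nested,
        # so the result does not depend on a <b> or <span> wrapper.
        if self._parts is not None:
            self._parts.append(data)


def cell_texts(html):
    """Return the text of every table cell in the given HTML string."""
    parser = CellTextCollector()
    parser.feed(html)
    parser.close()
    return parser.cells


# Made-up sample row: one cell wraps its text in <b>, the other mixes tags.
row = "<tr><td><b>Name</b></td><td>Test 1 <span>More</span> Test 2</td></tr>"
print(cell_texts(row))  # ['Name', 'Test 1 More Test 2']
```

As with findAll(text=True), the text comes out the same whether or not the cell wraps it in another tag.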
headerRows[0][10].contents[0].find('b').string