Getting all links from a page with Beautiful Soup
Replace your last line:
links = soup.find_all('a')
with this line:
links = [a.get('href') for a in soup.find_all('a', href=True)]
It collects every <a> tag that has an href attribute and, for each of those tags, appends the value of its href attribute to the links list.
If you want to know more about the for loop inside the [], read about list comprehensions.
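As a rough illustration of what the comprehension does, the sketch below runs it next to an equivalent explicit for loop. The HTML snippet and the name links_loop are made up for this example, and it assumes the built-in html.parser backend.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page, used only for illustration.
html = """
<a href="https://example.com/one">One</a>
<a name="anchor-without-href">No link here</a>
<a href="/two">Two</a>
"""

soup = BeautifulSoup(html, "html.parser")

# List comprehension: one href value per <a> tag that has an href attribute.
links = [a.get("href") for a in soup.find_all("a", href=True)]

# Equivalent explicit loop that builds the same list step by step.
links_loop = []
for a in soup.find_all("a", href=True):
    links_loop.append(a.get("href"))

print(links)       # ['https://example.com/one', '/two']
print(links_loop)  # same contents as links
```

The second <a> tag is skipped because href=True only matches tags that carry an href attribute.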
You are telling the find_all method to find tags named href, not attributes named href.
You need to find the <a> tags instead; they are used to represent link elements.
links = soup.find_all('a')
Later you can access their href attributes like this:
link = links[0]         # get the first link in the entire page
url = link['href']      # get the value of the href attribute (raises KeyError if the tag has no href)
url = link.get('href')  # or like this (returns None if the tag has no href)
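To make the difference between the calls concrete, here is a minimal sketch on a made-up two-tag snippet. The HTML and the variable names (no_match, first, second, missing_raises) are assumptions for illustration, not part of the original question.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet: one <a> with an href, one without.
html = '<a href="/home">Home</a><a id="top">Top</a>'
soup = BeautifulSoup(html, "html.parser")

# Searching for a tag named "href" finds nothing, because href is an attribute.
no_match = soup.find_all("href")

# Searching for <a> tags finds both link elements.
first, second = soup.find_all("a")

url_index = first["href"]       # '/home'
url_get = first.get("href")     # '/home'
missing = second.get("href")    # None, the tag has no href

# Square-bracket access on a missing attribute raises KeyError.
try:
    second["href"]
    missing_raises = False
except KeyError:
    missing_raises = True

print(len(no_match), url_index, url_get, missing, missing_raises)
```

Using get is the more forgiving choice when some <a> tags on the page may not carry an href attribute.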