Getting all links from a page with Beautiful Soup

Replace your last line:

links = soup.find_all('a')

with this line:

links = [a.get('href') for a in soup.find_all('a', href=True)]

This finds every a tag that has an href attribute (that is what href=True filters on) and collects each tag's href value into the links list.

If you want to know more about the for loop inside the square brackets, read about list comprehensions.
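
As a minimal sketch of that line in context, assuming a small inline HTML string (the markup and the variable name html are made up for illustration) and the built-in html.parser backend, here is the comprehension next to the equivalent explicit loop:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only.
html = """
<a href="https://example.com/one">One</a>
<a name="anchor-without-href">No link here</a>
<a href="/two">Two</a>
"""

soup = BeautifulSoup(html, "html.parser")

# href=True keeps only the <a> tags that actually have an href attribute.
links = [a.get('href') for a in soup.find_all('a', href=True)]

# The same thing written as an explicit for loop.
links_loop = []
for a in soup.find_all('a', href=True):
    links_loop.append(a.get('href'))

print(links)
```

The second a tag has no href, so it is skipped and never ends up in the list.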

You are telling the find_all method to look for tags named href, but href is an attribute, not a tag.

You need to find the <a> tags, which are the elements used to represent links.

links = soup.find_all('a')

Later you can access their href attributes like this:

link = links[0]          # get the first link in the entire page
url  = link['href']      # get value of the href attribute
url  = link.get('href')  # or like this
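
The two access styles behave differently when a tag has no href. A small sketch on made-up markup (the html string is an assumption, not taken from the question) shows that difference, and also why searching for 'href' as a tag name finds nothing:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, for illustration only.
html = '<a href="/first">First</a><a id="no-href">Second</a>'
soup = BeautifulSoup(html, "html.parser")

no_matches = soup.find_all('href')  # searches for <href> tags; there are none
links = soup.find_all('a')          # finds both <a> tags

first_url = links[0]['href']        # works because the attribute exists
missing = links[1].get('href')      # .get() returns None when it is absent

# Square-bracket access raises KeyError when the attribute is absent.
try:
    links[1]['href']
    raised = False
except KeyError:
    raised = True
```

If some of the links on the page may lack an href, .get('href') is the safer choice, or filter them out up front with find_all('a', href=True).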
