What does netloc mean?

According to RFC 1808, Section 2.1, a URL follows this general format:

<scheme>://<netloc>/<path>;<params>?<query>#<fragment>

Let's break this format down syntactically:

  • scheme: The protocol name, usually http/https
  • netloc: The network location, which includes the domain itself (and the subdomain, if present), an optional port number, and optional credentials in the form username:password. Together it may take the form username:password@host:80.
  • path: Identifies the specific resource on that host.
  • params: Parameters that fine-tune the path, introduced by a semicolon. (optional)
  • query: A query string that narrows down what is requested from the path, introduced by a question mark. (optional)
  • fragment: Identifies a particular part of the resource found at the path, introduced by a hash sign. (optional)
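
To make the netloc description above concrete, here is a small sketch using Python's urllib.parse. The URL, the credentials and the port 8080 are made up for illustration; the point is only that the parsed result exposes the netloc as a whole and also its individual pieces.

```python
from urllib.parse import urlparse

# Hypothetical URL carrying credentials and an explicit port in its netloc.
parts = urlparse("https://username:password@cat.example:8080/list")

print(parts.netloc)    # the whole network location, credentials and port included
print(parts.username)  # credentials part before the colon
print(parts.password)  # credentials part after the colon
print(parts.hostname)  # the domain on its own
print(parts.port)      # the port as an integer
```

Note that netloc keeps everything between the // and the first / as one string, while hostname and port give you the pieces separately.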

Let's take a very simple example to understand the above clearly:

https://cat.example/list;meow?breed=siberian#pawsize

In the above example:

  • https is the scheme (first element of a URL)
  • cat.example is the netloc (sits between the scheme and path)
  • /list is the path (between the netloc and params)
  • meow is the param (sits between path and query)
  • breed=siberian is the query (between the params and the fragment)
  • pawsize is the fragment (last element of a URL)

This can be replicated programmatically using Python's urllib.parse.urlparse:

>>> import urllib.parse
>>> url = 'https://cat.example/list;meow?breed=siberian#pawsize'
>>> urllib.parse.urlparse(url)
ParseResult(scheme='https', netloc='cat.example', path='/list', params='meow', query='breed=siberian', fragment='pawsize')

Now, coming to your code: the if statement checks whether next_page exists and whether next_page has a netloc. In that login() function, url_parse(next_page).netloc != '' is true when next_page carries a network location, which means it is an absolute URL rather than a relative one. A relative URL has a path but no hostname, so its netloc is the empty string.
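
As a rough illustration of that check, the sketch below uses the standard library's urllib.parse.urlparse, on the assumption that it behaves like url_parse for this purpose. The helper name is_relative and the sample URLs are hypothetical and not taken from your code.

```python
from urllib.parse import urlparse


def is_relative(url):
    """Hypothetical helper: True when the URL has no network location."""
    return urlparse(url).netloc == ""


print(is_relative("/index"))                    # True: path only, empty netloc
print(is_relative("https://cat.example/list"))  # False: netloc is cat.example
print(is_relative("//cat.example/list"))        # False: no scheme, but still a netloc
```

The last case is worth noticing: a URL that starts with // has no scheme, yet it still names a host, so the netloc test treats it as absolute.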


import urllib.parse

url = "https://example.com/something?a=1&b=1"
# urlsplit is like urlparse, but it does not split params from the path
o = urllib.parse.urlsplit(url)
print(o.netloc)

example.com