Audit url open for permitted schemes. Allowing use of "file:" or custom schemes is often unexpected
Because I stumbled upon this question and the accepted answer did not work for me, I researched this myself:
Why urllib is a security risk
urllib opens not only http:// and https:// URLs, but also ftp:// and file:// URLs. If an external user can manipulate the URL being opened, this may make it possible to read local files on the machine running the code, which is a security risk.
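To make the risk concrete, here is a minimal sketch showing that urlopen reads a local file when handed a file:// URL. The temporary file and the names secret_path, file_url and leaked are only for illustration.

```python
import tempfile
import urllib.request
from pathlib import Path

# Create a throwaway local file that stands in for something sensitive.
with tempfile.TemporaryDirectory() as tmp_dir:
    secret_path = Path(tmp_dir) / "secret.txt"
    secret_path.write_text("local secret")

    # as_uri() turns the absolute path into a file:// URL.
    file_url = secret_path.resolve().as_uri()

    # urlopen reads the local file; no network access is involved.
    with urllib.request.urlopen(file_url) as resp:
        leaked = resp.read().decode()

print(leaked)
```

If file_url came from user input instead, the same call would hand back whatever local file the user pointed it at.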
How to fix this
You are responsible for validating the URL yourself before opening it with urllib. E.g.
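import urllib.request

# Checking for the full prefixes rejects look-alike schemes that merely start with "http"
if url.lower().startswith(('http://', 'https://')):
    req = urllib.request.Request(url)
else:
    raise ValueError('URL must start with http:// or https://') from None
with urllib.request.urlopen(req) as resp:
    [...]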
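As a variation on the prefix check, the scheme can also be parsed with urllib.parse and compared against an allow-list. This is only a sketch: the names ALLOWED_SCHEMES and validate_url are made up for this example, and the allow-list should be adapted to whatever your application actually needs.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; extend it only if you really need other schemes.
ALLOWED_SCHEMES = {"http", "https"}

def validate_url(url):
    """Return url unchanged if it is an http(s) URL with a host, else raise ValueError."""
    parts = urlsplit(url)
    # urlsplit lowercases the scheme, so "HTTP://..." is accepted as well.
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError(f"refusing to open URL with scheme {parts.scheme!r}")
    return url

# Usage (not executed here because it needs network access):
# import urllib.request
# with urllib.request.urlopen(validate_url(user_supplied_url)) as resp:
#     data = resp.read()
```

Parsing the scheme also rejects inputs such as httpfoo://..., which a plain startswith('http') check would let through.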
How to fix this so the linter (e.g. bandit) no longer complains
Bandit, at least, uses a simple blacklist of function calls. As long as you call urlopen from urllib, the linter will raise a warning, even if you DO validate your input as shown above (or even use hardcoded URLs).
Add a # nosec comment to the line to suppress the warning from bandit, or look up the suppression keyword for your linter/code checker. It's best practice to also add a comment stating WHY you think this is not worth a warning in your case.
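A suppressed call with a written justification could look roughly like this. The function name fetch is hypothetical, and the B310 after nosec assumes a bandit version that accepts test IDs in the comment. With older versions a plain # nosec should do.

```python
import urllib.request

def fetch(url, timeout=10):
    """Fetch an http(s) URL and return the raw response body."""
    # Restrict the scheme first; this check is the reason the suppression below is safe.
    if not url.lower().startswith(("http://", "https://")):
        raise ValueError("only http:// and https:// URLs are allowed")
    # Justification for nosec: the scheme was limited to http(s) above,
    # so file:// and custom schemes cannot reach urlopen.
    with urllib.request.urlopen(url, timeout=timeout) as resp:  # nosec B310
        return resp.read()
```

The # nosec marker has to sit on the line that bandit reports, which here is the line with the urlopen call.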
I think this is what you need
import urllib.request
req = urllib.request.Request('http://www.example.com')
with urllib.request.urlopen(req) as response:
    the_page = response.read()
For those who couldn't solve it with the answers above:
You could use the requests library instead, which is not blacklisted in bandit.
https://bandit.readthedocs.io/en/latest/blacklists/blacklist_calls.html#b310-urllib-urlopen
import requests
url = 'http://www.example.com'
the_page = requests.get(url, timeout=10)  # a timeout keeps the call from hanging indefinitely
print(the_page.text)  # if the response is some text
# print(the_page.json())  # use this instead if the response is JSON