Validate a filename in Python
There is now a full library for validating and sanitizing filenames and file paths, pathvalidate. Check it out:
from pathvalidate import sanitize_filepath
fpath = "fi:l*e/p\"a?t>h|.t<xt"
print("{} -> {}".format(fpath, sanitize_filepath(fpath)))
fpath = "\0_a*b:c<d>e%f/(g)h+i_0.txt"
print("{} -> {}".format(fpath, sanitize_filepath(fpath)))
Output:
fi:l*e/p"a?t>h|.t<xt -> file/path.txt
_a*b:c<d>e%f/(g)h+i_0.txt -> _abcde%f/(g)h+i_0.txt
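If you only want to see the basic idea without installing anything, here is a minimal standard-library sketch that reproduces the two outputs above. It is not how pathvalidate is implemented, and the library covers more cases (for instance platform-specific rules); the name strip_reserved_chars and the exact character set are my own assumptions.

```python
import re

# Assumed character set: ASCII control characters plus the characters
# that Windows does not allow in file names. "/" is kept as a separator.
_RESERVED = re.compile(r'[\x00-\x1f:*?"<>|]')


def strip_reserved_chars(path):
    """Remove reserved characters from `path` (hypothetical helper)."""
    return _RESERVED.sub("", path)


for fpath in ('fi:l*e/p"a?t>h|.t<xt', "\0_a*b:c<d>e%f/(g)h+i_0.txt"):
    print("{!r} -> {!r}".format(fpath, strip_reserved_chars(fpath)))
```

Note that stripping characters only cleans up names. It does not stop a path such as ../secret from escaping a base directory.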
Armin Ronacher has a blog post on this subject (and others).
These ideas are implemented as the safe_join() function in Flask:
def safe_join(directory, filename):
    """Safely join `directory` and `filename`.

    Example usage::

        @app.route('/wiki/<path:filename>')
        def wiki_page(filename):
            filename = safe_join(app.config['WIKI_FOLDER'], filename)
            with open(filename, 'rb') as fd:
                content = fd.read()  # Read and process the file content...

    :param directory: the base directory.
    :param filename: the untrusted filename relative to that directory.
    :raises: :class:`~werkzeug.exceptions.NotFound` if the resulting path
             would fall out of `directory`.
    """
    filename = posixpath.normpath(filename)
    for sep in _os_alt_seps:
        if sep in filename:
            raise NotFound()
    if os.path.isabs(filename) or filename.startswith('../'):
        raise NotFound()
    return os.path.join(directory, filename)
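The snippet above relies on names defined elsewhere in Flask (posixpath, _os_alt_seps, NotFound), so it will not run on its own. Below is a self-contained sketch of the same idea using only the standard library. The name safe_join_sketch, the use of ValueError in place of NotFound, and my reconstruction of the alternative-separator list are assumptions for illustration, not Flask's actual code.

```python
import os
import posixpath

# Assumption: separators other than "/" that this platform understands
# (for example "\\" on Windows). On POSIX systems this list is empty.
_alt_seps = [sep for sep in (os.sep, os.altsep) if sep not in (None, "/")]


def safe_join_sketch(directory, filename):
    """Join an untrusted, "/"-separated `filename` onto `directory`.

    Raises ValueError if the result could fall outside `directory`.
    """
    # Collapse "a/../b" and "./a" before inspecting the result.
    filename = posixpath.normpath(filename)
    if any(sep in filename for sep in _alt_seps):
        raise ValueError("alternative path separator in filename")
    if (
        os.path.isabs(filename)
        or filename.startswith("/")
        or filename == ".."
        or filename.startswith("../")
    ):
        raise ValueError("filename escapes the base directory")
    return posixpath.join(directory, filename)


print(safe_join_sketch("/srv/wiki", "notes/../index.txt"))
```

In a web application you would probably translate the ValueError into a 404 response, which is what raising NotFound achieves in the original.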
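You can force the user to create files and directories only inside the wiki by normalizing the path with os.path.normpath and then checking that the result begins with the wiki directory, say '(path-to-wiki)', followed by a path separator. Without the separator, a sibling directory such as '(path-to-wiki)-other' would also pass the check:
os.path.normpath('(path-to-wiki)/foo/bar.txt').startswith('(path-to-wiki)' + os.sep)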
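As a variation on the check above, here is a small sketch that compares whole path components with os.path.commonpath instead of comparing string prefixes. The function name is_inside and the example directories are made up for illustration. The check is purely lexical and does not resolve symlinks; os.path.realpath could be used instead of abspath if that matters to you.

```python
import os


def is_inside(base_dir, user_path):
    """Return True if base_dir/user_path stays within base_dir (lexically)."""
    base = os.path.abspath(base_dir)
    # os.path.join discards `base` when user_path is absolute, so an
    # absolute user_path is compared as-is and rejected below.
    target = os.path.abspath(os.path.join(base, user_path))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


for candidate in ("foo/bar.txt", "../secret", "../wiki-evil/x"):
    print(candidate, is_inside("/srv/wiki", candidate))
```

The third candidate normalizes to /srv/wiki-evil/x, which starts with the string /srv/wiki but is not inside that directory, so comparing components rejects it.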
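To ensure that the path or filename the user enters contains nothing nasty, you can restrict it to lowercase and uppercase letters, digits, and perhaps the hyphen and underscore.
Then you can check the normalized filename with a regular expression that matches any character outside that set; a non-empty result means the path contains a disallowed character. Note that this character class also admits the backslash (the Windows path separator) but not '/' or '.', so extend it if your paths need them:
userpath = os.path.normpath('(path-to-wiki)/foo/bar.txt')
re.findall(r'[^A-Za-z0-9_\-\\]', userpath)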
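Another way to apply the same whitelist is to check each path component separately, which lets you allow a separator and a file extension without opening the door to '..'. This is only a sketch: the function name has_only_allowed_names, the "/" separator, and the rule of at most one alphanumeric extension are my assumptions, so adjust them to your needs.

```python
import re

# One path component: letters, digits, "_" or "-", optionally followed
# by a single alphanumeric extension such as ".txt" (assumed rule).
_ALLOWED_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9]+)?")


def has_only_allowed_names(relative_path):
    """Return True if every "/"-separated component matches the whitelist."""
    parts = relative_path.split("/")
    return all(_ALLOWED_NAME.fullmatch(part) for part in parts)


for candidate in ("foo/bar.txt", "foo/../bar.txt", "fo o.txt"):
    print(candidate, has_only_allowed_names(candidate))
```

Because '..', '.' and empty components never match the pattern, this also rejects traversal attempts and doubled separators.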
To summarize:
import os
import re

userpath = os.path.normpath('(path-to-wiki)/foo/bar.txt')
# The trailing separator keeps '(path-to-wiki)-other' from matching.
if (not userpath.startswith('(path-to-wiki)' + os.sep)
        or re.search(r'[^A-Za-z0-9_\-\\]', userpath)):
    ...  # Do whatever you want with an invalid path