Need of using 'r' before path-name while reading a csv file with pandas
- This solution by Denziloe does a perfect job of explaining why `r` may precede a path string.
  - `r'C:\Users\username'` works.
  - `r'C:\Users\username\'` does not, because the trailing `\` escapes the `'`.
  - `r'C:\Users\username\' + file`, where `file = 'test.csv'`, also won't work.
    - Results in `SyntaxError: EOL while scanning string literal` (Python 3.10 and later word this as `SyntaxError: unterminated string literal`).
- `pandas` methods that read a file, such as `pandas.read_csv`, will accept a `str` or a `pathlib` object for a file path.
- If you need to iterate through a list of file names, you can add them to the path with an `f-string` as well.
  - `num = 6`, `f'I have {num} files'` evaluates to `'I have 6 files'`, is an example of using an `f-string`.
import pandas as pd

files = ['test1.csv', 'test2.csv', 'test3.csv']
df_list = list()
for file in files:
    df_list.append(pd.read_csv(rf'C:\Users\username\{file}'))  # path with f-string
df = pd.concat(df_list)
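Since `pandas.read_csv` also accepts a `pathlib` object, the same list of paths can be built without any string formatting. This is only a minimal sketch, not part of the original answer: `base` and `paths` are hypothetical names, and `PureWindowsPath` is used just so the result looks the same on any operating system. Real code would more likely use `pathlib.Path`.

```python
from pathlib import PureWindowsPath

# Hypothetical folder; there is no trailing backslash to worry about.
base = PureWindowsPath(r'C:\Users\username')
files = ['test1.csv', 'test2.csv', 'test3.csv']

# The / operator joins path components with the correct separator.
paths = [base / file for file in files]

for path in paths:
    # Each path object could be handed to pd.read_csv(path) in place of a str.
    print(path)
```

Because the folder and the file name are joined as separate components, the trailing-backslash problem never comes up.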
In Python, a backslash is used to signify special characters.
For example, in `"hello\nworld"` the `\n` means a newline. Try printing it.
Path names on Windows tend to have backslashes in them. But we want them to mean actual backslashes, not special characters.
`r` stands for "raw" and causes backslashes in the string to be interpreted as actual backslashes rather than special characters.
For example, `r"hello\nworld"` literally means the characters `hello\nworld`. Again, try printing it.
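The difference is easy to check in an interpreter. A minimal sketch (the variable names `normal` and `raw` are just for illustration):

```python
normal = "hello\nworld"   # \n is one character: a newline
raw = r"hello\nworld"     # \n is two characters: a backslash and an n

print(normal)  # printed on two lines
print(raw)     # printed on one line, backslash included

# The raw string is one character longer, because the backslash is kept.
print(len(normal), len(raw))
```

The raw string is the same as writing the backslash twice in a normal string, that is, `"hello\\nworld"`.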
More info is in the Python docs; it's a good idea to search them for questions like these:
https://docs.python.org/3/tutorial/introduction.html#strings
A raw string will handle backslashes in most cases, such as this example:
In [11]:
r'c:\path'
Out[11]:
'c:\\path'
However, if there is a trailing backslash then it will break:
In [12]:
r'c:\path\'
  File "<ipython-input-12-9995c7b1654a>", line 1
    r'c:\path\'
               ^
SyntaxError: EOL while scanning string literal
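If a path really does need to end in a backslash, one workaround is to append it as a separate, non-raw string. The sketch below is illustrative only (the names `base`, `with_sep`, `escaped`, `source` and `parsed` are made up for this example). It also uses the built-in `compile` to show that the failing literal is rejected when the source is parsed, before any code runs.

```python
# Append the trailing backslash separately, as an escaped (non-raw) string.
base = r'c:\path'
with_sep = base + '\\'

# Equivalent: drop the raw prefix and double every backslash.
escaped = 'c:\\path\\'

print(with_sep)
print(with_sep == escaped)

# The broken literal fails at parse time, so it is fed in as source text.
source = "r'c:\\path\\'"  # the characters r'c:\path\'
try:
    compile(source, '<example>', 'eval')
    parsed = True
except SyntaxError:
    parsed = False
print(parsed)
```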
Forward slashes don't have this problem:
In [13]:
r'c:/path/'
Out[13]:
'c:/path/'
The safe and portable method is to always use forward slashes and, when building a string for a full path, to use `os.path` so that the path is built correctly when the code is executed on different operating systems:
In [14]:
import os
path = 'c:/'
folder = 'path/'
os.path.join(path, folder)
Out[14]:
'c:/path/'
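To see what `os.path.join` does on each platform without switching machines, the standard library exposes the Windows and POSIX implementations directly as `ntpath` and `posixpath` (`os.path` is whichever of the two matches the running system). The following is a small sketch with a hypothetical file name, `test.csv`:

```python
import ntpath      # the implementation os.path uses on Windows
import posixpath   # the implementation os.path uses on Linux and macOS

folder = 'c:/path'
file = 'test.csv'  # hypothetical file name

win = ntpath.join(folder, file)
posix = posixpath.join(folder, file)
print(win)    # separator inserted by the Windows implementation
print(posix)  # separator inserted by the POSIX implementation

# On Windows, normpath converts the forward slashes to backslashes.
print(ntpath.normpath(win))
```

Either joined form can be opened on Windows, since Windows accepts both separators in a path. That is why the forward-slash version is the portable choice.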