Pandas reading csv files with partial wildcard
glob returns a list, not a string. The read_csv function takes a string as the input to find the file. Try this:
    from glob import glob
    import pandas as pd

    for f in glob('somefile*.csv'):
        df = pd.read_csv(f)
        ...
        # the rest of your script
To read all of the files that follow a certain pattern, so long as they share the same schema, use this function:
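    import glob
    import pandas as pd

    def pd_read_pattern(pattern):
        files = glob.glob(pattern)
        if not files:
            # pd.concat raises ValueError on an empty list
            return pd.DataFrame()
        # DataFrame.append was removed in pandas 2.0, so concatenate once instead
        frames = [pd.read_csv(f) for f in files]
        return pd.concat(frames, ignore_index=True)

    df = pd_read_pattern('somefile*.csv')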
This will work with either an absolute or relative path.
You can get the list of the CSV files in the current working directory and loop over them.
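    import os
    from os import listdir
    from os.path import isfile, join
    import pandas as pd

    mypath = os.getcwd()
    # endswith avoids matching names such as 'data.csv.bak'
    csvfiles = [f for f in listdir(mypath) if isfile(join(mypath, f)) and f.endswith('.csv')]
    for f in csvfiles:
        df = pd.read_csv(join(mypath, f))
        # the rest of your script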
Loop over each file and build a list of DataFrames, then assemble them together using concat.
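A minimal sketch of that list-then-concat approach, assuming every matching file shares the same columns. The function name read_csvs_concat is hypothetical and only used for illustration here.

```python
import glob
import pandas as pd

def read_csvs_concat(pattern):
    """Read every CSV matching `pattern` and stack the rows into one DataFrame."""
    # Sort the matches so the row order does not depend on filesystem ordering.
    files = sorted(glob.glob(pattern))
    if not files:
        # pd.concat raises ValueError on an empty list, so return early.
        return pd.DataFrame()
    # Build the list of per-file DataFrames first, then concatenate once.
    frames = [pd.read_csv(f) for f in files]
    # ignore_index=True renumbers the rows 0..n-1 across all files.
    return pd.concat(frames, ignore_index=True)
```

Calling read_csvs_concat('somefile*.csv') should give one DataFrame with the rows of every match. Concatenating once at the end avoids copying the accumulated frame on every loop iteration.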