Pandas reading csv files with partial wildcard

glob returns a list of matching paths, not a single string, while read_csv expects one path per call. Loop over the matches instead:

from glob import glob
import pandas as pd

for f in glob('somefile*.csv'):
    df = pd.read_csv(f)  # f is a single path string
    ...
    # the rest of your script

To read all of the files that follow a certain pattern, so long as they share the same schema, use this function:

import glob
import pandas as pd

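def pd_read_pattern(pattern):
    files = glob.glob(pattern)
    if not files:
        # nothing matched: return an empty frame rather than failing in concat
        return pd.DataFrame()

    # DataFrame.append was removed in pandas 2.0; read each file, then concatenate once
    frames = [pd.read_csv(f) for f in files]
    return pd.concat(frames, ignore_index=True)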

df = pd_read_pattern('somefile*.csv')

This will work with either an absolute or relative path.

You can also list the CSV files in the current working directory and loop over them.

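import os
from os import listdir
from os.path import isfile, join

import pandas as pd

mypath = os.getcwd()

# endswith('.csv') avoids matching names such as 'data.csv.bak'
csvfiles = [f for f in listdir(mypath) if isfile(join(mypath, f)) and f.endswith('.csv')]

for f in csvfiles:
    df = pd.read_csv(join(mypath, f))
    # the rest of your script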

Loop over each file and build a list of DataFrames, then combine them into one with concat.
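A minimal sketch of that approach, assuming all matching files share the same columns. The helper name read_matching_csvs and the demo file names are made up for illustration, and pathlib is used here only as one way to do the pattern matching:

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_matching_csvs(directory, pattern):
    """Read every CSV in `directory` matching `pattern` into one DataFrame.

    Hypothetical helper for illustration; assumes the files share a schema.
    """
    # sorted() gives a deterministic row order; glob order is not guaranteed
    paths = sorted(Path(directory).glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern!r} in {directory}")

    # Build the list of DataFrames first, then concatenate once
    frames = [pd.read_csv(p) for p in paths]
    return pd.concat(frames, ignore_index=True)


# Demonstration with throwaway files (names are arbitrary)
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"a": [1, 2], "b": [3, 4]}).to_csv(Path(tmp, "somefile_1.csv"), index=False)
    pd.DataFrame({"a": [5], "b": [6]}).to_csv(Path(tmp, "somefile_2.csv"), index=False)
    # This file does not match the pattern and should be ignored
    pd.DataFrame({"a": [9], "b": [9]}).to_csv(Path(tmp, "other.csv"), index=False)

    combined = read_matching_csvs(tmp, "somefile*.csv")

print(combined)
```

Concatenating once at the end avoids copying the accumulated data on every iteration, which is what happens when a DataFrame is grown inside the loop.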
