How to check if .xls and .csv files are empty
I don't think Stackoverflow allows 2 question at the time but let me give you my answer for the Excel part
import xlrd
from pprint import pprint
wb = xlrd.open_workbook("temp.xlsx")
empty_sheets = [sheet for sheet in wb.sheets() if sheet.ncols == 0]
non_empty_sheets = [sheet for sheet in wb.sheets() if sheet.ncols > 0]
# printing names of empty sheets
pprint([sheet.name for sheet in empty_sheets])
# writing non empty sheets to database
pass # write code yourself or ask another question
About the header: let me give you a little hint, test for sheet.nrows == 1
.
Question 1: How I check the entire .xls file are empty.
def readfile(fullpath):
xls = xlrd.open_workbook(fullpath)
is_empty = None
for sheet in xls.sheets():
number_of_rows = sheet.nrows
if number_of_rows == 1:
header = sheet.row_values(0)
# then If it contains only headers I want to treat as empty
if header:
is_empty = False
break
if number_of_rows > 1:
is_empty = False
break
number_of_columns = sheet.ncols
sheetname = sheet.name
if is_empty:
print('xlsx ist empty')
Question 2: How I check header of the file .If the file has only a header(I mean only a single row) I need to treat the file is empty .How can I do that.
import csv
with open('test/empty.csv', 'r') as csvfile:
csv_dict = [row for row in csv.DictReader(csvfile)]
if len(csv_dict) == 0:
print('csv file is empty')
Tested with Python:3.4.2
This is simple in pandas with the .empty method. Do this
import pandas as pd
df = pd.read_csv(filename) # or pd.read_excel(filename) for xls file
df.empty # will return True if the dataframe is empty or False if not.
This will also return True for a file with only headers as in
>> df = pd.DataFrame(columns = ['A','B'])
>> df.empty
True