How to iterate over rows in a DataFrame in Pandas
Let's say we have a DataFrame from Pandas:
import pandas as pd
inp = [{'column1': 1, 'column2': 2}, {'column1': 3, 'column2': 4}, {'column1': 5, 'column2': 6}]
df = pd.DataFrame(inp)
print(df)
Output:
   column1  column2
0        1        2
1        3        4
2        5        6
We now want to iterate over the rows of this frame.
There are two methods to do this: DataFrame.iterrows() and DataFrame.itertuples().
Method 1: Using DataFrame.iterrows()
DataFrame.iterrows is a generator which yields both the index and row (as a Series):
import pandas as pd
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
for index, row in df.iterrows():
    print(row['column1'], row['column2'])
Output:
1 4
2 5
3 6
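One consequence of each row being a Series is worth seeing in code: a Series has a single dtype, so a row that spans an integer column and a float column comes back with its integers converted to floats. The sketch below illustrates this; the frame `mixed` and its column names `qty` and `price` are made up for the example and are not part of the article's data.

```python
import pandas as pd

# Hypothetical frame mixing an integer column and a float column.
mixed = pd.DataFrame({'qty': [1, 2], 'price': [1.5, 2.5]})

# Take the first (index, row) pair produced by iterrows().
index, row = next(mixed.iterrows())

print(type(row).__name__)   # the row is a pandas Series
print(mixed['qty'].dtype)   # dtype of the column inside the frame
print(row.dtype)            # dtype of the row Series (float64 here)
print(row['qty'])           # the integer 1 has become the float 1.0
```

If the original dtypes matter, itertuples() is usually the safer choice because it does not pack each row into a Series.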
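Method 2: Using DataFrame.itertuples()
DataFrame.itertuples yields each row as a namedtuple whose first field is the index and whose remaining fields are the column values:
import pandas as pd
df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})
for row in df.itertuples():
    print(row.column1, row.column2)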
Output:
1 4
2 5
3 6
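itertuples() also accepts two optional parameters, `index` and `name`, that control what each yielded row looks like. The following is a minimal sketch using the same frame as above; the variable names `first` and `plain_rows` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'column1': [1, 2, 3], 'column2': [4, 5, 6]})

# Default call: namedtuples with the index available as the field "Index".
first = next(df.itertuples())
print(first.Index, first.column1, first.column2)

# index=False drops the index field; name=None yields plain tuples
# instead of namedtuples.
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows)
```

Plain tuples are handy when the rows are only unpacked positionally, for example `for a, b in df.itertuples(index=False, name=None)`.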
itertuples() is reported to be much faster than iterrows(), largely because iterrows() has to build a new Series for every row.
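The speed difference can be checked on your own machine with the standard `timeit` module. This is only a rough sketch: the frame size, the repeat count, and the helper names `sum_with_iterrows` and `sum_with_itertuples` are arbitrary choices for the example, and the actual timings depend on the pandas version and hardware.

```python
import timeit

import pandas as pd

# Hypothetical test frame; the size is arbitrary.
frame = pd.DataFrame({'column1': range(2000), 'column2': range(2000)})


def sum_with_iterrows(data):
    """Add up both columns row by row using iterrows()."""
    total = 0
    for _, row in data.iterrows():
        total += row['column1'] + row['column2']
    return total


def sum_with_itertuples(data):
    """Add up both columns row by row using itertuples()."""
    total = 0
    for row in data.itertuples(index=False):
        total += row.column1 + row.column2
    return total


# Run each loop three times and report the total elapsed time.
t_rows = timeit.timeit(lambda: sum_with_iterrows(frame), number=3)
t_tuples = timeit.timeit(lambda: sum_with_itertuples(frame), number=3)
print(f"iterrows:   {t_rows:.4f} s")
print(f"itertuples: {t_tuples:.4f} s")
```

Both helpers compute the same total, so any difference in the printed times comes from the iteration method itself.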