Pandas df.itertuples renaming dataframe columns when printing
This seems to be an issue with handling column names having spaces in them. If you replace the column names with different ones without spaces, it will work:
df.columns = ['us_qqq_equity', 'us_spy_equity']
# df.columns = df.columns.str.replace(r'\s+', '_') # Courtesy @MaxU
for r in df.head().itertuples():
print(r)
# Pandas(Index='2017-06-19', us_qqq_equity=0.0, us_spy_equity=1.0)
# Pandas(Index='2017-06-20', us_qqq_equity=0.0, us_spy_equity=-1.0)
# ...
Column names with spaces cannot effectively be represented in named tuples, so they are renamed automatically when printing.
Interesting observation: out of DataFrame.iterrows()
, DataFrame.iteritems()
, DataFrame.itertuples()
only the last one renames the columns, containing spaces:
In [140]: df = df.head(3)
In [141]: list(df.iterrows())
Out[141]:
[(Timestamp('2017-06-19 00:00:00'), us qqq equity 0.0
us spy equity 1.0
Name: 2017-06-19 00:00:00, dtype: float64),
(Timestamp('2017-06-20 00:00:00'), us qqq equity 0.0
us spy equity -1.0
Name: 2017-06-20 00:00:00, dtype: float64),
(Timestamp('2017-06-21 00:00:00'), us qqq equity 0.0
us spy equity 0.0
Name: 2017-06-21 00:00:00, dtype: float64)]
In [142]: list(df.iteritems())
Out[142]:
[('us qqq equity', date
2017-06-19 0.0
2017-06-20 0.0
2017-06-21 0.0
Name: us qqq equity, dtype: float64), ('us spy equity', date
2017-06-19 1.0
2017-06-20 -1.0
2017-06-21 0.0
Name: us spy equity, dtype: float64)]
In [143]: list(df.itertuples())
Out[143]:
[Pandas(Index=Timestamp('2017-06-19 00:00:00'), _1=0.0, _2=1.0),
Pandas(Index=Timestamp('2017-06-20 00:00:00'), _1=0.0, _2=-1.0),
Pandas(Index=Timestamp('2017-06-21 00:00:00'), _1=0.0, _2=0.0)]