Pandas plot function ignores timezone of timeseries
import pandas as pd
from pytz import timezone as ptz
import matplotlib as mpl
...
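# Parse the index as UTC, then convert it to the target zone.
# (tz_localize would raise here because the index is already tz-aware.)
data.index = pd.to_datetime(data.index, utc=True).tz_convert(tz=ptz('<your timezone>'))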
...
mpl.rcParams['timezone'] = data.index.tz.zone
... after which matplotlib labels the time axis in that zone rather than in UTC.
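As a minimal sketch of the conversion step, assuming the raw index holds times that are known to be UTC: the sample values and the 'Europe/Berlin' zone below are placeholders for illustration, not taken from the original data. It shows that tz_convert only changes the zone the timestamps are displayed in, while each one still refers to the same instant.

```python
import pandas as pd

# Placeholder data: naive strings that are known to be UTC.
raw_index = ['2019-03-21 03:56:28', '2019-03-21 04:00:30']
data = pd.Series([130.0, 131.5], index=raw_index)

# Parse as UTC, then convert to the zone the plot should be labelled in.
utc_index = pd.to_datetime(data.index, utc=True)
data.index = utc_index.tz_convert('Europe/Berlin')  # assumed zone

# Zone name that could be assigned to mpl.rcParams['timezone'].
zone_name = str(data.index.tz)
```

Using str(data.index.tz) here is a design choice for the sketch: it gives the zone name without relying on the pytz-specific .zone attribute.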
Note, however, that if you need to annotate, the x locations of the annotations must still be given in UTC, whereas strings passed to data.loc[] or data.at[] are assumed to be in the timezone set on the index!
For instance, I needed to show a series of vertical lines labelled with their timestamps. (This comes after most of the plot calls; note that the timestamp strings in sels are UTC.)
sels = ['2019-03-21 3:56:28',
        '2019-03-21 4:00:30',
        '2019-03-21 4:05:55',
        '2019-03-21 4:13:40']
ax.vlines(sels, 125, 145, lw=1, color='grey')  # 125 was bottom, 145 was top in data units
for s in sels:
    tstr = pd.to_datetime(s, utc=True)\
        .astimezone(tz=ptz(data.index.tz.zone))\
        .isoformat().split('T')[1].split('+')[0]
    ax.annotate(tstr, xy=(s, 125), xycoords='data',
                xytext=(0, 5), textcoords='offset points', rotation=90,
                horizontalalignment='right', verticalalignment='bottom')
This puts grey vertical lines at the times chosen manually in sels, and labels them in local-timezone hours, minutes and seconds. (The .split()[] business discards the date and timezone info from the .isoformat() string.)
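One caveat with splitting the ISO string on '+' is that a zone west of UTC produces an offset such as -04:00, which that split leaves in place. A possible alternative, sketched here with a hypothetical helper named local_time_label and example zones that are my own assumptions, is to format the converted timestamp with strftime:

```python
import pandas as pd


def local_time_label(utc_string, tz_name):
    """Return HH:MM:SS in tz_name for a timestamp string given in UTC.

    Hypothetical helper for illustration; not part of pandas.
    """
    ts = pd.to_datetime(utc_string, utc=True).tz_convert(tz_name)
    return ts.strftime('%H:%M:%S')


# Same UTC instant, labelled in two assumed example zones.
label_east = local_time_label('2019-03-21 3:56:28', 'Europe/Berlin')     # '04:56:28'
label_west = local_time_label('2019-03-21 3:56:28', 'America/New_York')  # '23:56:28'
```

The returned string could be passed to ax.annotate in place of tstr.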
But when I need to actually get the corresponding values from data using the same s in sels, I then have to use the somewhat awkward:
data.tz_convert('UTC').at[s]
Whereas just
data.at[s]
fails with a KeyError, because pandas interprets s as being in the data.index.tz timezone and, so interpreted, the timestamps fall outside the range of the contents of data.
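The lookup behaviour can be reproduced on synthetic data. In this sketch the one-second series and the 'Europe/Berlin' zone are invented for illustration. It also shows one possible alternative to converting the whole object to UTC: looking up with an explicit tz-aware Timestamp, which pandas matches by instant regardless of the zone of the index.

```python
import pandas as pd

# Placeholder series: one value per second, starting 03:50:00 UTC,
# then shown in an assumed local zone (UTC+1 on this date).
idx = pd.date_range('2019-03-21 03:50:00', periods=1800,
                    freq=pd.Timedelta(seconds=1), tz='UTC')
data = pd.Series(range(len(idx)), index=idx).tz_convert('Europe/Berlin')

s = '2019-03-21 3:56:28'  # a UTC time, written without an offset

# An explicit UTC Timestamp is matched by instant, whatever zone the index uses.
value = data.loc[pd.to_datetime(s, utc=True)]

# The bare string is read in the index's zone (02:56:28 UTC here),
# which lies before the start of the data, so the lookup fails.
try:
    data.loc[s]
    bare_string_found = True
except KeyError:
    bare_string_found = False
```

Here value is the entry 388 seconds after the start of the series, the same one that data.tz_convert('UTC').loc[s] returns.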
This is definitely a bug. I've created a report on GitHub. The reason is that, internally, pandas converts a regular-frequency DatetimeIndex to a PeriodIndex in order to hook into its own formatters/locators, and PeriodIndex currently does NOT retain timezone information. Please stay tuned for a fix.