What is an appropriate data type to store a timezone?
Unfortunately PostgreSQL doesn't offer a time zone data type, so you should probably use text.
interval seems like a logical option at first glance, and it is appropriate for some uses. However, it does not account for daylight saving time, nor for the fact that different regions in the same UTC offset have different DST rules.
There is not a 1:1 mapping from UTC offset back to time zone.
For example, the time zone for Australia/Sydney (New South Wales) is UTC+10 (EST), or UTC+11 (EDT) during daylight saving time. Yes, that's the same acronym EST that the USA uses; time zone acronyms are non-unique in the tzdata database, which is why Pg has the timezone_abbreviations setting. Worse, Brisbane (Queensland) is at almost the same longitude and is in UTC+10 EST ... but doesn't have daylight saving, so it is sometimes at a -1 hour offset to New South Wales during NSW's DST.
(Update: More recently Australia adopted an A prefix, so it uses AEST as its eastern states TZ acronym, but EST and WST remain in common use.)
Confusing much?
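To make the Sydney/Brisbane difference concrete, here is a minimal Python sketch. It assumes Python 3.9+ with the standard zoneinfo module and tz data available on the machine; the helper name offset_hours and the two sample dates are made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9; needs tz data installed


def offset_hours(zone_name: str, instant: datetime) -> float:
    """Return the UTC offset, in hours, that zone_name observes at instant."""
    local = instant.astimezone(ZoneInfo(zone_name))
    return local.utcoffset().total_seconds() / 3600


# Two arbitrary instants: one in the southern summer, one in the southern winter.
january = datetime(2023, 1, 15, 0, 0, tzinfo=timezone.utc)
july = datetime(2023, 7, 15, 0, 0, tzinfo=timezone.utc)

for zone in ("Australia/Sydney", "Australia/Brisbane"):
    # Sydney changes between the two dates; Brisbane does not.
    print(zone, offset_hours(zone, january), offset_hours(zone, july))
```

A single stored offset could describe either city in July, but only one of them in January, which is why the zone name is the thing worth storing.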
If all you need to store is a UTC offset then an interval is appropriate. If you want to store a time zone, store it as text. It's a pain to validate and to convert to a time zone offset at the moment, but at least it copes with DST.
"+hh:mm" and "-hh:mm" are not time zones, they are UTC offsets. A good format to save those is a signed integer with the offset in minutes. You can also use things like interval, but that will only help you if you want to do date calculations directly in PostgreSQL, like in a query. Usually, though, you do these calculations in another language, and then it depends on whether that language supports the interval type well and has a good date/time library. But converting an integer into some sort of interval-like type, like Python's timedelta, should be trivial, so I would personally just store it as an integer.
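As a rough illustration of how little work that conversion is, here is a sketch in Python using only the standard datetime module. The function name offset_from_minutes and the sample value -330 are hypothetical; the value stands in for whatever integer was read from the column.

```python
from datetime import datetime, timedelta, timezone


def offset_from_minutes(minutes: int) -> timezone:
    """Build a fixed-offset tzinfo from a signed minute count."""
    return timezone(timedelta(minutes=minutes))


stored_minutes = -330  # hypothetical column value, meaning UTC-05:30
utc_instant = datetime(2024, 3, 1, 12, 0, tzinfo=timezone.utc)

# Shift the UTC instant into the stored fixed offset.
local = utc_instant.astimezone(offset_from_minutes(stored_minutes))
print(local.isoformat())  # 2024-03-01T06:30:00-05:30
```

Note that the result is a fixed offset only: it carries no DST rules, which is exactly the limitation of storing offsets instead of zone names.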
Time zones have names, and although there are no standardized names for the time zones, there is one de facto standard in the "tz" or "zoneinfo" database, with names like "Europe/Paris", "America/New_York" or "US/Pacific". Those should be stored as strings.
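If the names are checked on the application side, one possible approach is to ask the tz database whether it knows the name. The sketch below assumes Python 3.9+ with zoneinfo and tz data installed; is_known_zone is a made-up helper name and "Mars/Olympus_Mons" is a deliberately invalid example.

```python
from zoneinfo import ZoneInfo, ZoneInfoNotFoundError


def is_known_zone(name: str) -> bool:
    """Return True if name resolves to a zone in the local tz database."""
    try:
        ZoneInfo(name)
    except (ZoneInfoNotFoundError, ValueError):
        # Unknown names raise ZoneInfoNotFoundError; malformed keys may raise ValueError.
        return False
    return True


print(is_known_zone("Europe/Paris"))       # a real tz database name
print(is_known_zone("Mars/Olympus_Mons"))  # not in the tz database
```

The answer depends on the tz data installed where the code runs, so the application and the database can disagree about very new or very old zone names.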
Windows uses completely different names, such as "Romance time" (don't ask). You can store those as strings as well, but I would avoid it: these names aren't used outside Windows, and the names make no sense. Besides, translated versions of Windows tend to use translated names for these time zones, making it even worse.
Abbreviations like "PDT" and "EST" are not usable as time zone names, because they are not unique. There are four (I think, or was it five?) different time zones all called "CST", so that's not usable.
In short: For time zones, store the name as a string. For UTC offsets, store the offset in minutes as a signed integer.
In an ideal world you could have a foreign key to a set of known timezones. You can do something close to this with views and domains.
This wiki tip by David E. Wheeler creates a domain that is tested for its validity as a timezone:
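-- Returns TRUE if tz is a time zone name PostgreSQL accepts, FALSE otherwise.
CREATE OR REPLACE FUNCTION is_timezone( tz TEXT ) RETURNS BOOLEAN AS $$
BEGIN
    -- AT TIME ZONE raises invalid_parameter_value for an unknown zone.
    PERFORM now() AT TIME ZONE tz;
    RETURN TRUE;
EXCEPTION WHEN invalid_parameter_value THEN
    RETURN FALSE;
END;
$$ LANGUAGE plpgsql STABLE;

-- CITEXT is provided by the citext extension (CREATE EXTENSION citext).
CREATE DOMAIN timezone AS CITEXT
    CHECK ( is_timezone( value ) );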
It's useful to have a list of known timezones, in which case you could dispense with the domain and just enforce the constraint in the one table containing the known timezone names (obtained from the view pg_timezone_names), avoiding the need to expose the domain elsewhere:
CREATE TABLE tzone
(
    -- A column-level PRIMARY KEY takes no column list.
    tzone_name text PRIMARY KEY CHECK (is_timezone(tzone_name))
);
INSERT INTO tzone (tzone_name)
SELECT name FROM pg_timezone_names;
Then you can enforce correctness through foreign keys:
CREATE TABLE myTable (
    ...
    tzone TEXT REFERENCES tzone(tzone_name)
);