`uuuu` versus `yyyy` in `DateTimeFormatter` formatting pattern codes in Java?
Long story short
- For 99 % of purposes you can toss a coin, it will make no difference whether you use `yyyy` or `uuuu` (or whether you use `yy` or `uu` for 2-digit year).
- It depends on what you want to happen in case a year earlier than 1 CE (1 AD) occurs. The point being that in 99 % of programs such a year will never occur.
Two other answers have already presented the facts of how `u` and `y` work very nicely, but I still felt something was missing, so I am contributing the slightly more opinion-based answer.
For formatting
Assuming that you don’t expect a year before 1 CE to be formatted, the best thing you can do is to check this assumption and react appropriately in case it breaks. For example, depending on circumstances and requirements, you may print an error message or throw an exception. One very soft failure path might be to use a pattern with `y` (year of era) and `G` (era) in this case and a pattern with either `u` or `y` in the normal, current era case. Note that if you are printing the current date or the date your program was compiled, you can be sure that it is in the common era and may opt to skip the check.
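The soft failure path described above could look roughly like the following. This is only a sketch: the class name `EraAwareFormatting`, the concrete patterns, and the choice of `Locale.ENGLISH` are assumptions made for the illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

class EraAwareFormatting {
    // Normal case: the date lies in the current era (year 1 CE or later).
    private static final DateTimeFormatter CURRENT_ERA =
            DateTimeFormatter.ofPattern("uuuu-MM-dd", Locale.ENGLISH);

    // Fallback: year-of-era plus era, so an early year stays unambiguous.
    private static final DateTimeFormatter WITH_ERA =
            DateTimeFormatter.ofPattern("yyyy-MM-dd G", Locale.ENGLISH);

    static String format(LocalDate date) {
        // getYear() is the proleptic year; anything below 1 is before 1 CE.
        return date.getYear() >= 1 ? CURRENT_ERA.format(date) : WITH_ERA.format(date);
    }

    public static void main(String[] args) {
        System.out.println(format(LocalDate.of(2018, 9, 29))); // 2018-09-29
        System.out.println(format(LocalDate.of(-43, 3, 15)));  // 0044-03-15 BC
    }
}
```

If an early year should be treated as a bug rather than displayed, the second branch could throw an exception instead.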
For parsing
In many (most?) cases parsing also means validating, meaning you have no guarantees about what your input string looks like. Typically it comes from the user or from another system. An example: a date string comes as `2018-09-29`. Here the choice between `uuuu` and `yyyy` should depend on what you want to happen in case the string contains a year of 0 or negative (e.g., `0000-08-17` or `-012-11-13`). Assuming that this would be an error, the immediate answer is: use `yyyy` in order for an exception to be thrown in this case. Still finer: use `uuuu` and after parsing perform a range check of the parsed date. The latter approach allows both for a finer validation and for a better error message in case of a validation error.
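A minimal sketch of both options, using the year-0 string from the example. The class name `YearRangeCheck` and the bounds `MIN_YEAR` and `MAX_YEAR` are hypothetical; pick whatever range makes sense for your data.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

class YearRangeCheck {
    // Hypothetical bounds, chosen only for this illustration.
    static final int MIN_YEAR = 1900;
    static final int MAX_YEAR = 2100;

    private static final DateTimeFormatter YEAR_OF_ERA_FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter PROLEPTIC_YEAR_FORMAT =
            DateTimeFormatter.ofPattern("uuuu-MM-dd");

    // Option 1: let yyyy reject year 0 (year-of-era must be at least 1).
    static boolean acceptedByYyyy(String text) {
        try {
            LocalDate.parse(text, YEAR_OF_ERA_FORMAT);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    // Option 2: parse with uuuu, then validate the year explicitly.
    static LocalDate parseChecked(String text) {
        LocalDate date = LocalDate.parse(text, PROLEPTIC_YEAR_FORMAT);
        if (date.getYear() < MIN_YEAR || date.getYear() > MAX_YEAR) {
            throw new IllegalArgumentException(
                    "Year " + date.getYear() + " is outside " + MIN_YEAR + ".." + MAX_YEAR);
        }
        return date;
    }

    public static void main(String[] args) {
        System.out.println(acceptedByYyyy("2018-09-29")); // true
        System.out.println(acceptedByYyyy("0000-08-17")); // false
        System.out.println(parseChecked("2018-09-29"));   // 2018-09-29
    }
}
```

With option 2 the error message can name the offending year and the allowed range, which a generic parse exception cannot.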
Special case (already mentioned by Meno Hochschild): If your formatter uses strict resolver style and contains `y` without `G`, parsing into a date will always fail because strictly speaking year of era is ambiguous without era: 1950 might mean 1950 CE or 1950 BCE (1950 BC). So in this case you need `u` (or you need to supply a default era, which is possible through a `DateTimeFormatterBuilder`).
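The three strict variants can be compared side by side. This is a sketch under the assumption that a default era of CE is acceptable for your input; the class name `StrictYearParsing` and its field names are made up for the example.

```java
import java.time.LocalDate;
import java.time.chrono.IsoEra;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.time.temporal.ChronoField;

class StrictYearParsing {
    // y without G: in strict mode the year-of-era cannot be resolved to a year.
    static final DateTimeFormatter STRICT_Y =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    // u: the proleptic year needs no era.
    static final DateTimeFormatter STRICT_U =
            DateTimeFormatter.ofPattern("uuuu-MM-dd").withResolverStyle(ResolverStyle.STRICT);

    // y plus a default era supplied through the builder.
    static final DateTimeFormatter STRICT_Y_DEFAULT_ERA = new DateTimeFormatterBuilder()
            .appendPattern("yyyy-MM-dd")
            .parseDefaulting(ChronoField.ERA, IsoEra.CE.getValue())
            .toFormatter()
            .withResolverStyle(ResolverStyle.STRICT);

    static boolean parses(String text, DateTimeFormatter formatter) {
        try {
            LocalDate.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("2018-09-29", STRICT_Y));             // false
        System.out.println(parses("2018-09-29", STRICT_U));             // true
        System.out.println(parses("2018-09-29", STRICT_Y_DEFAULT_ERA)); // true
    }
}
```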
Long story short again
Explicit range check of your dates, specifically your years, is better than relying on the choice between `uuuu` and `yyyy` for catching unexpected very early years.
The javadoc section Patterns for Formatting and Parsing for `DateTimeFormatter` lists the following 3 relevant symbols:
    Symbol  Meaning       Presentation  Examples
    ------  -------       ------------  --------
    G       era           text          AD; Anno Domini; A
    u       year          year          2004; 04
    y       year-of-era   year          2004; 04
Just for comparison, these other symbols are easy enough to understand:
    D       day-of-year   number        189
    d       day-of-month  number        10
    E       day-of-week   text          Tue; Tuesday; T
The `day-of-year`, `day-of-month`, and `day-of-week` are obviously the day within the given scope (year, month, week).
So, `year-of-era` means the year within the given scope (era), and right above it `era` is shown with an example value of `AD` (the other value of course being `BC`).
`year` is the signed year, where year `0` is `1 BC`, year `-1` is `2 BC`, and so forth.
To illustrate: When was Julius Caesar assassinated?
- March 15, 44 BC (using pattern `MMMM d, y GG`)
- March 15, -43 (using pattern `MMMM d, u`)
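Both renderings can be reproduced with a few lines of Java. The class name `CaesarExample` is made up, and `Locale.ENGLISH` is passed explicitly so the month and era text do not depend on the default locale.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.util.Locale;

class CaesarExample {
    // 44 BC is proleptic year -43, because year 0 is 1 BC.
    static final LocalDate IDES_OF_MARCH = LocalDate.of(-43, 3, 15);

    static String withEra() {
        return DateTimeFormatter.ofPattern("MMMM d, y GG", Locale.ENGLISH).format(IDES_OF_MARCH);
    }

    static String withSignedYear() {
        return DateTimeFormatter.ofPattern("MMMM d, u", Locale.ENGLISH).format(IDES_OF_MARCH);
    }

    public static void main(String[] args) {
        System.out.println(withEra());        // March 15, 44 BC
        System.out.println(withSignedYear()); // March 15, -43
        // The same two views are available as fields on the date itself.
        System.out.println(IDES_OF_MARCH.get(ChronoField.YEAR_OF_ERA)); // 44
        System.out.println(IDES_OF_MARCH.getYear());                    // -43
    }
}
```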
The distinction will of course only matter if the year is zero or negative, and since that is rare, most people don't care, even though they should.
Conclusion: If you use `y` you should also use `G`. Since `G` is rarely used, the correct year symbol is `u`, not `y`, otherwise a non-positive year will show incorrectly.
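To see what "show incorrectly" means in practice, here is a small sketch (the class name `AmbiguousYearOfEra` is invented for the example) in which two dates 87 years apart produce the same text when `y` is used without `G`:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

class AmbiguousYearOfEra {
    private static final DateTimeFormatter Y_ONLY = DateTimeFormatter.ofPattern("yyyy-MM-dd");
    private static final DateTimeFormatter U_ONLY = DateTimeFormatter.ofPattern("uuuu-MM-dd");

    static String yOnly(LocalDate date) {
        return Y_ONLY.format(date);
    }

    static String uOnly(LocalDate date) {
        return U_ONLY.format(date);
    }

    public static void main(String[] args) {
        LocalDate year44Ad = LocalDate.of(44, 3, 15);
        LocalDate year44Bc = LocalDate.of(-43, 3, 15);

        // y without G drops the era, so both dates look identical.
        System.out.println(yOnly(year44Ad)); // 0044-03-15
        System.out.println(yOnly(year44Bc)); // 0044-03-15

        // u keeps the sign, so the two dates stay distinguishable.
        System.out.println(uOnly(year44Ad)); // 0044-03-15
        System.out.println(uOnly(year44Bc)); // -0043-03-15
    }
}
```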
This is known as defensive programming:
Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software under unforeseen circumstances.
Note that `DateTimeFormatter` is consistent with `SimpleDateFormat`:
    Letter  Date or Time Component  Presentation  Examples
    ------  ----------------------  ------------  --------
    G       Era designator          Text          AD
    y       Year                    Year          1996; 96
Negative years have always been a problem, and they have now been fixed by adding `u`.
Within the scope of the `java.time` package, we can say:
It is safer to use "u" instead of "y" because `DateTimeFormatter` will otherwise insist on having an era in combination with "y" (= year-of-era). So using "u" would avoid some possible unexpected exceptions in strict formatting/parsing. See also this SO-post. Another minor thing which is improved by the "u" symbol compared with "y" is printing/parsing negative Gregorian years (in the far past).
Otherwise we can clearly state that using "u" instead of "y" breaks long-standing habits in Java programming. It is also not intuitively clear that "u" denotes any kind of year because a) the first letter of the English word "year" is not in agreement with this symbol and b) `SimpleDateFormat` has used "u" for a different purpose since Java 7 (ISO day number of week). Confusion is guaranteed - for ever?
We should also see that using eras (symbol "G") in the context of ISO is in general dangerous if we consider historic dates. If "G" is used with "u" then both fields are unrelated to each other. And if "G" is used with "y" then the formatter is satisfied but still uses the proleptic Gregorian calendar when the historic date mandates different calendars and date handling.
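The two meanings of "u" are easy to demonstrate. In this sketch the class name `TwoMeaningsOfU` is invented, and the time zone is pinned to UTC only so that the legacy formatter sees the same calendar day on every machine.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class TwoMeaningsOfU {
    // 2018-09-29 was a Saturday, i.e. ISO day number 6.
    static final LocalDate SATURDAY = LocalDate.of(2018, 9, 29);

    // SimpleDateFormat: "u" is the day number of the week (1 = Monday, ..., 7 = Sunday).
    static String legacyU() {
        SimpleDateFormat legacy = new SimpleDateFormat("u", Locale.ROOT);
        legacy.setTimeZone(TimeZone.getTimeZone("UTC"));
        Date date = Date.from(SATURDAY.atStartOfDay(ZoneOffset.UTC).toInstant());
        return legacy.format(date);
    }

    // DateTimeFormatter: "u" is the proleptic year.
    static String modernU() {
        return DateTimeFormatter.ofPattern("u").format(SATURDAY);
    }

    public static void main(String[] args) {
        System.out.println(legacyU()); // 6
        System.out.println(modernU()); // 2018
    }
}
```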
Background information:
When developing and integrating JSR 310 (the `java.time` packages), the designers decided to use the Common Locale Data Repository (CLDR)/LDML spec as the base of pattern symbols in `DateTimeFormatter`. The symbol "u" was already defined in CLDR as proleptic Gregorian year, so this meaning was adopted for the then upcoming JSR-310 (but not for `SimpleDateFormat`, for backwards compatibility reasons).
However, this decision to follow CLDR was not quite consistent because JSR-310 had also introduced new pattern symbols which didn't and still don't exist in CLDR, see also this old CLDR-ticket. The suggested symbol "I" was changed by CLDR to "VV" and finally adopted by JSR-310, including new symbols "x" and "X". But "n" and "N" still don't exist in CLDR, and since this old ticket is closed, it is not clear at all if CLDR will ever support them in the sense of JSR-310. Furthermore, the ticket does not mention the symbol "p" (padding instruction in JSR-310, but not defined in CLDR). So we still have no perfect agreement between pattern definitions across different libraries and languages.
And about "y": We should also not overlook the fact that CLDR associates this year-of-era with at least some kind of mixed Julian/Gregorian year and not with the proleptic Gregorian year as JSR-310 does (leaving the oddity of negative years aside). So there is no perfect agreement between CLDR and JSR-310 here, either.