Why does the Java Scheduler exhibit significant time drift on Windows?
As pointed out in the comments, the ScheduledThreadPoolExecutor
bases its calculations on System.nanoTime()
. For better or worse, the old Timer
API however preceeded nanoTime()
, and so uses System.currentTimeMillis()
instead.
The difference here might seem subtle, but is more significant than one might expect. Contrary to popular belief, nanoTime()
is not just a "more accurate version" of currentTimeMillis()
. Millis is locked to system time, whereas nanos is not. Or as the docs put it:
This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time. [...] The values returned by this method become meaningful only when the difference between two such values, obtained within the same instance of a Java virtual machine, is computed.
In your example, you're not following this guidance for the values to be "meaningful" - understandably, because the ScheduledThreadPoolExecutor
only uses nanoTime()
as an implementation detail. But the end result is the same, that being that you can't guarantee that it will stay synchronised to the system clock.
But why not? Seconds are seconds, right, so the two should stay in sync from a certain, known point?
Well, in theory, yes. But in practice, probably not.
Taking a look at the relevant native code on windows:
LARGE_INTEGER current_count;
QueryPerformanceCounter(¤t_count);
double current = as_long(current_count);
double freq = performance_frequency;
jlong time = (jlong)((current/freq) * NANOSECS_PER_SEC);
return time;
We see nanos()
uses the QueryPerformanceCounter
API, which works by QueryPerformanceCounter
getting the "ticks" of a frequency that's defined by QueryPerformanceFrequency
. That frequency will stay identical, but the timer it's based off, and its synchronistaion algorithm that windows uses, varies by configuration, OS, and underlying hardware. Even ignoring the above, it's never going to be close to 100% accurate (it's based of a reasonably cheap crystal oscillator somewhere on the board, not a Caesium time standard!) so it's going to drift out with the system time as NTP keeps it in sync with reality.
In particular, this link gives some useful background, and reinforces the above pont:
When you need time stamps with a resolution of 1 microsecond or better and you don't need the time stamps to be synchronized to an external time reference, choose QueryPerformanceCounter.
(Bolding is mine.)
For your specific case of Windows 7 performing badly, note that in Windows 8+, the TSC synchronisation algorithm was improved, and QueryPerformanceCounter
was always based on a TSC (as oppose to Windows 7, where it could be a TSC, HPET or the ACPI PM timer - the latter of which is especially rather inaccurate.) I suspect this is the most likely reason the situation improves tremendously on Windows 10.
That being said, the above factors still mean that you can't rely on the ScheduledThreadPoolExecutor
to keep in time with "real" time - it will always drift. If that drift is an issue, then it's not a solution you can rely on in this context.
Side note: In Windows 8+, there is a GetSystemTimePreciseAsFileTime
function which offers the high resolution of QueryPerformanceCounter
combined with the accuracy of the system time. If Windows 7 was dropped as a supported platform, this could in theory be used to provide a System.getCurrentTimeNanos()
method or similar, assuming other similar native functions exist for other supported platforms.
CronScheduler is a project of mine designed to be proof against time drift problem, and at the same time it avoids some of the problems with the old Timer
class described in this post.
Example usage:
Duration syncPeriod = Duration.ofMinutes(1);
CronScheduler cron = CronScheduler.create(syncPeriod);
cron.scheduleAtFixedRateSkippingToLatest(0, 1, TimeUnit.MINUTES, runTimeMillis -> {
// Collect and send summary metrics to a remote monitoring system
});
Note: this project was actually inspired by this StackOverflow question.