What causes the JVM to do a major garbage collection?
I have found four conditions that can cause a major GC (given my JVM config):
- The old gen area is full (even if it can be grown, a major GC will still be run first)
- The perm gen area is full (even if it can be grown, a major GC will still be run first)
- Someone is manually calling System.gc(): a bad library or something related to RMI (see links 1, 2 and 3)
- The young gen areas are all full and nothing is ready to be moved into old gen (see 1)
As others have commented, cases 1 and 2 can be improved by allocating plenty of heap and permgen, and setting -Xms and -Xmx to the same value (along with the perm equivalents) to avoid dynamic heap resizing.
Case 3 can be avoided using the -XX:+DisableExplicitGC flag.
Case 4 requires more involved tuning, e.g., -XX:NewRatio=N (see Oracle's tuning guide).
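To tell which of these cases is actually firing, it can help to watch the per-collector counters the JVM exposes through java.lang.management. The following is a minimal sketch, not something from the answers above: the class name GcCounters is made up for illustration, and the collector names it prints depend on which collector your JVM is running, so you have to check for your own setup which name corresponds to major collections.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: GcCounters is an illustrative name, not a JDK class.
class GcCounters {
    // One line per registered collector: name, collections so far, total time.
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Print the counters once; call snapshot() periodically to see which one moves.
        snapshot().forEach(System.out::println);
    }
}
```

Logging a snapshot every minute or so and comparing the counts shows whether the old-generation collector is running on a fixed schedule (which points at explicit calls) or only when the heap fills up.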
Garbage collection is a pretty complicated topic, and while you could learn all the details about this, I think what’s happening in your case is pretty simple.
Sun’s Garbage Collection Tuning guide, under the “Explicit Garbage Collection” heading, warns:
applications can interact with garbage collection … by invoking full garbage collections explicitly … This can force a major collection to be done when it may not be necessary … One of the most commonly encountered uses of explicit garbage collection occurs with RMI … RMI forces full collections periodically
That guide says that the default time between garbage collections is one minute, but the sun.rmi Properties reference, under sun.rmi.dgc.server.gcInterval, says:
The default value is 3600000 milliseconds (one hour).
If you’re seeing major collections every hour in one application but not another, it’s probably because the application is using RMI, possibly only internally, and you haven’t added -XX:+DisableExplicitGC to the startup flags.
Disable explicit GC, or test this hypothesis by setting -Dsun.rmi.dgc.server.gcInterval=7200000 and observing if GCs happen every two hours instead.
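Before changing anything, you can confirm from inside the running application which of these settings is in effect. This is a rough sketch under a few assumptions: ExplicitGcCheck and its method names are invented for illustration, the flag check is a plain string match that ignores a later -XX:-DisableExplicitGC overriding an earlier one, and the fallback of 3600000 ms is simply the default quoted from the properties reference above, which may not hold for every JDK version.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: the class and method names are illustrative only.
class ExplicitGcCheck {
    // Naive check: true if the flag appears anywhere in the JVM arguments.
    static boolean explicitGcDisabled(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+DisableExplicitGC");
    }

    // Parses the property value, falling back to the default quoted above
    // (assumed, not read from the JVM) when it is unset or malformed.
    static long rmiServerGcIntervalMs(String propertyValue) {
        final long quotedDefaultMs = 3_600_000L;
        if (propertyValue == null) {
            return quotedDefaultMs;
        }
        try {
            return Long.parseLong(propertyValue.trim());
        } catch (NumberFormatException e) {
            return quotedDefaultMs;
        }
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        String interval = System.getProperty("sun.rmi.dgc.server.gcInterval");
        System.out.println("explicit GC disabled: " + explicitGcDisabled(jvmArgs));
        System.out.println("RMI server GC interval (ms): " + rmiServerGcIntervalMs(interval));
    }
}
```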
It depends on your configuration, since HotSpot configures itself differently in different Java environments. For example, on a server with more than 2GB of memory and two processors, some JVMs will run in '-server' mode instead of the default '-client' mode. The two modes size the memory spaces (generations) differently, and that affects when garbage collection occurs.
A full GC can occur automatically, but also if you call the garbage collector in your code (e.g., using System.gc()). When it happens automatically, the timing depends on how the minor collections are behaving.
There are at least two algorithms being used. If you are using defaults, a copying algorithm is used for minor collections, and a mark-sweep algorithm for major collections.
A copying algorithm consists of copying used memory from one block to another, and then clearing the space containing the blocks with no references to them. The copying algorithm in the JVM uses a large area for newly created objects (called Eden), and two smaller ones (called survivors). Surviving objects are copied once from Eden and then several times between the survivor spaces, once per minor collection, until they become tenured and are copied to another space (called tenured space), where they can only be removed in a major collection.
Most of the objects in Eden die quickly, so the first collection copies the surviving objects to the survivor spaces (which are by default much smaller). There are two survivors, s1 and s2. Every time Eden fills, the surviving objects from Eden and s1 are copied to s2, and Eden and s1 are cleared. Next time, survivors from Eden and s2 are copied back to s1. Objects keep being copied from s1 to s2 to s1 until a certain number of copies is reached, a block is too big to fit, or some other criterion is met. Then the surviving memory block is copied to the tenured generation.
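The back-and-forth between the two survivor spaces is easier to follow as a toy model. The sketch below is only an illustration of the scheme just described, not how HotSpot is implemented: the class ToyGenerations, the "live" flag standing in for reachability, and the threshold and capacity values are all made up, and capacity is counted in objects rather than bytes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of Eden plus two survivor spaces; every name here is hypothetical.
class ToyGenerations {
    static final class Obj {
        final String name;
        boolean live = true; // stands in for "still reachable"
        int age = 0;         // minor collections survived so far
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold; // copies allowed before promotion
    final int survivorCapacity;  // survivor size, counted in objects
    List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor currently holding objects
    List<Obj> to = new ArrayList<>();   // survivor currently empty
    final List<Obj> tenured = new ArrayList<>();

    ToyGenerations(int tenuringThreshold, int survivorCapacity) {
        this.tenuringThreshold = tenuringThreshold;
        this.survivorCapacity = survivorCapacity;
    }

    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Copy live objects from Eden and the occupied survivor into the empty
    // survivor, promoting those that are old enough or that do not fit.
    void minorCollection() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= tenuringThreshold || to.size() >= survivorCapacity) {
                tenured.add(o);
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from; // swap roles: the filled space becomes "from"
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        ToyGenerations heap = new ToyGenerations(3, 10);
        heap.allocate("long-lived");
        heap.allocate("short-lived").live = false;
        for (int i = 1; i <= 3; i++) {
            heap.minorCollection();
            System.out.println("after minor GC " + i + ": survivor=" + heap.from.size()
                    + ", tenured=" + heap.tenured.size());
        }
    }
}
```

Note that a small survivorCapacity pushes objects into the tenured list early, which is the same effect that makes undersized survivor spaces lead to more frequent major collections.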
The tenured objects are not affected by the minor collections. They accumulate until the area gets full (or the garbage collector is called). Then the JVM will run a mark-sweep algorithm in a major collection which will preserve only the surviving objects that still have references.
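For readers who have not seen mark-sweep before, here is a minimal sketch of the idea: mark everything reachable from a set of roots, then keep only what was marked. It is a teaching model, not the JVM's collector; ToyMarkSweep, Node, and the explicit roots list are hypothetical names, and a real collector works on raw memory rather than Java lists.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy mark-sweep over an object graph; all names are illustrative.
class ToyMarkSweep {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    // Returns the nodes of 'heap' that survive, in their original order.
    static List<Node> collect(List<Node> heap, List<Node> roots) {
        // Mark phase: walk the graph from the roots.
        // Node does not override equals/hashCode, so the set compares by identity.
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {
                pending.addAll(n.refs);
            }
        }
        // Sweep phase: anything left unmarked is garbage.
        List<Node> survivors = new ArrayList<>();
        for (Node n : heap) {
            if (marked.contains(n)) {
                survivors.add(n);
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node c = new Node("c"); // nothing points at c
        a.refs.add(b);
        List<Node> heap = new ArrayList<>(List.of(a, b, c));
        for (Node n : collect(heap, List.of(a))) {
            System.out.println("kept " + n.name);
        }
    }
}
```

Because marking starts from the roots, objects that only reference each other in a cycle are still collected, which is why "still have references" above really means "still reachable".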
If you have larger objects that don't fit into the survivors, they might be copied directly to the tenured space, which will fill more quickly and you will get major collections more frequently.
Also, the sizes of the survivor spaces, the number of copies between s1 and s2, the size of Eden relative to s1 and s2, and the size of the tenured generation may all be configured differently in different environments by JVM ergonomics, which may automatically select -server or -client behavior. You might try running both JVMs as -server or -client and check if they still behave differently.
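To compare what ergonomics actually chose on each machine, you can print the VM name and the memory pools the JVM reports. This is a small sketch using the standard java.lang.management API; GenerationSizes is a made-up class name, and the pool names in the output (Eden, Survivor, Old Gen and so on) vary with the collector and JVM version, so treat them as something to inspect rather than rely on.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: GenerationSizes is an illustrative name, not a JDK class.
class GenerationSizes {
    // One line per memory pool: name, heap or non-heap, and maximum size in bytes.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();              // null if the pool is no longer valid
            long max = (usage == null) ? -1 : usage.getMax(); // -1 means the maximum is undefined
            lines.add(pool.getName() + " [" + pool.getType() + "] max=" + max);
        }
        return lines;
    }

    public static void main(String[] args) {
        // The VM name usually indicates whether a client or server VM is running.
        System.out.println(ManagementFactory.getRuntimeMXBean().getVmName());
        describePools().forEach(System.out::println);
    }
}
```

Running this on both machines and diffing the output shows quickly whether the generations were sized differently.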