Java's WeakHashMap and caching: Why is it referencing the keys, not the values?

WeakHashMap isn't useful as a cache, at least not in the way most people think of one. As you say, it uses weak keys, not weak values, so it isn't designed for what most people want to use it for (and, in fact, I've seen people use it that way, incorrectly).

WeakHashMap is mostly useful for keeping metadata about objects whose lifecycle you don't control. For example, you might have a bunch of objects passing through your class and want to keep track of extra data about them, without needing to be notified when they go out of scope and without your reference to them keeping them alive.

A simple example (and one I've used before) might be something like:

WeakHashMap<Thread, SomeMetaData>

where you might keep track of what various threads in your system are doing. Once a thread has died and nothing else references it, its entry will be removed silently from your map, and you won't keep the Thread from being garbage collected if you're the last reference to it. You can then iterate over the entries in that map to find out what metadata you have about active threads in your system.
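A minimal sketch of that idea, using only the JDK. The class name ThreadActivityRegistry, its methods, and the choice of a String as the metadata are assumptions made for illustration; they are not from any library. The metadata must not hold a strong reference back to its Thread, or the entry could never be collected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical registry; names are illustrative only.
final class ThreadActivityRegistry {
    // Keys are held weakly, so an entry does not keep its Thread alive.
    // WeakHashMap is not thread-safe, so every access synchronizes on the map.
    private final Map<Thread, String> activity = new WeakHashMap<>();

    void record(Thread thread, String description) {
        synchronized (activity) {
            activity.put(thread, description);
        }
    }

    String lookup(Thread thread) {
        synchronized (activity) {
            return activity.get(thread);
        }
    }

    // Snapshot of the metadata for threads that are still in the map and alive.
    List<String> describeLiveThreads() {
        List<String> result = new ArrayList<>();
        synchronized (activity) {
            for (Map.Entry<Thread, String> entry : activity.entrySet()) {
                Thread thread = entry.getKey();
                if (thread.isAlive()) {
                    result.add(thread.getName() + ": " + entry.getValue());
                }
            }
        }
        return result;
    }
}
```

Nothing here ever removes an entry explicitly; that is left to the map, which drops an entry some time after its key has been collected.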

See "WeakHashMap is not a cache!" for more information.

For the type of cache you're after, either use a dedicated cache system (e.g. EHCache) or look at Guava's MapMaker class; something like

new MapMaker().weakValues().makeMap();

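will do what you're after, or if you want to get fancy you can add timed expiration. (This used to be MapMaker.expiration(...); that method was later removed from Guava, and expiration now lives in CacheBuilder.)

CacheBuilder.newBuilder().weakValues().expireAfterWrite(5, TimeUnit.MINUTES).build();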

The main use for WeakHashMap is when you have mappings which you want to disappear when their keys disappear. A cache is the reverse: you have mappings which you want to disappear when their values disappear.

For a cache, what you want is a Map<K,SoftReference<V>>. The referent of a SoftReference is cleared at the collector's discretion, typically when memory gets tight; the JVM only guarantees that all softly reachable objects are cleared before it throws an OutOfMemoryError. (Contrast this with a WeakReference, which may be cleared as soon as there is no longer a strong reference to its referent.) You want your references to be soft in a cache (at least in one where key-value mappings don't go stale), since then there is a chance that your values will still be in the cache if you look for them later. If the references were weak instead, your values could be collected at the next GC cycle after the last strong reference went away, defeating the purpose of caching.

For convenience, you might want to hide the SoftReference values inside your Map implementation, so that your cache appears to be of type <K,V> instead of <K,SoftReference<V>>. If you want to do that, this question has suggestions for implementations available on the net.
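As a rough illustration of hiding the SoftReference, here is a minimal sketch using only the JDK. The class name SoftValueCache and its methods are made up for this example, and it is deliberately simple: a stale mapping is only dropped when someone looks it up.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical wrapper: callers see K -> V; the SoftReference is an internal detail.
final class SoftValueCache<K, V> {
    private final Map<K, SoftReference<V>> map = new HashMap<>();

    synchronized void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    // Returns the cached value, or null if it was never cached or has been cleared.
    synchronized V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The collector cleared the referent; drop the stale mapping.
            map.remove(key);
        }
        return value;
    }

    synchronized int size() {
        return map.size();
    }
}
```

A caller treats a null result as a cache miss and recomputes the value, whether the entry was never there or was cleared under memory pressure.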

Note also that when you use SoftReference values in a Map, you must manually remove the key-value pairs whose SoftReferences have been cleared; otherwise your Map will grow without bound and leak memory.
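One common way to do that cleanup is to register each SoftReference with a ReferenceQueue and drain the queue on every access. The sketch below assumes that approach; PurgingSoftCache and KeyedRef are hypothetical names, and a real cache would probably also want a size bound.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache that removes mappings whose values have been cleared.
final class PurgingSoftCache<K, V> {
    // A SoftReference that remembers its key, so its mapping can be found later.
    private static final class KeyedRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized void put(K key, V value) {
        purge();
        map.put(key, new KeyedRef<>(key, value, queue));
    }

    synchronized V get(K key) {
        purge();
        KeyedRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    synchronized int size() {
        purge();
        return map.size();
    }

    // Drain the queue of references the collector has cleared and enqueued.
    private void purge() {
        Reference<? extends V> cleared;
        while ((cleared = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedRef<K, V> ref = (KeyedRef<K, V>) cleared;
            // Remove only if the map still holds this exact reference;
            // the key may have been re-mapped to a fresh value since.
            map.remove(ref.key, ref);
        }
    }
}
```

The two-argument remove matters: without it, a late-arriving stale reference could evict a newer value stored under the same key.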


Another thing to consider is that if you take the Map<K, WeakReference<V>> approach, the value may disappear, but the mapping will not. Depending on usage, you may as a result end up with a Map containing many entries whose WeakReferences have been cleared.