Java heap analysis with OQL: Count unique strings

The following is based on the answer by Peter Dolberg and can be used in the VisualVM OQL Console:

// Running count per distinct string value, keyed by the string's text.
var counts = {};
// String values that the final filter has already let through.
var alreadyReturned = {};

filter(
  sort(
    map(heap.objects("java.lang.String"),
    function(heapString) {
      // Convert once instead of calling toString() repeatedly.
      var s = heapString.toString();
      counts[s] = (counts[s] || 0) + 1;
      return { string: s, count: counts[s] };
    }),
    // Numeric comparator: descending by count, so the highest
    // running count for each string value comes first.
    'rhs.count - lhs.count'),
  function(countObject) {
    // Keep only the first (highest-count) entry for each string value.
    if (!alreadyReturned[countObject.string]) {
      alreadyReturned[countObject.string] = true;
      return true;
    }
    return false;
  }
);

It starts with a map() call over all String instances. For each String it creates or increments an entry in the counts object, which is used as a map from string value to running count, and returns a new object with a string field and a count field.
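To see what the map() step produces, here is a minimal sketch in plain JavaScript that runs outside VisualVM. It assumes an ordinary array of strings (the hypothetical `sample`) in place of heap.objects("java.lang.String"), and it uses Array.prototype.map in place of the OQL map() function.

```javascript
// Hypothetical sample data standing in for the String instances on the heap.
var sample = ["a", "b", "a", "c", "a", "b"];

// Running count per distinct value, as in the OQL query.
var counts = {};

var mapped = sample.map(function (s) {
  counts[s] = (counts[s] || 0) + 1;
  // Each entry records the count seen so far for this value.
  return { string: s, count: counts[s] };
});

console.log(JSON.stringify(mapped));
```

The third "a" in the sample ends up with count 3, while the earlier entries for "a" keep their lower running counts of 1 and 2.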

The resulting array contains one entry per String instance, each with a count one larger than the previous entry for the same string value. The array is then sorted by the count field in descending order, and the result looks something like this:

{
count = 1028.0,
string = *null*
}

{
count = 1027.0,
string = *null*
}

{
count = 1026.0,
string = *null*
}

...

(In my test, the string "*null*" was the most common.)

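The last step filters the sorted array with a function that returns true only for the first occurrence of each string value. Because the array is sorted by descending count, that first occurrence carries the highest count, which is the total number of instances with that value. The alreadyReturned object keeps track of which values have already been included.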
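The sort and filter steps can be sketched the same way in plain JavaScript. This is only an illustration: the `entries` array is hypothetical data in the shape the map() step produces, and Array.prototype.sort and Array.prototype.filter stand in for the OQL sort() and filter() functions.

```javascript
// Hypothetical entries in the shape produced by the map() step:
// one per String instance, each carrying a running count for its value.
var entries = [
  { string: "a", count: 1 },
  { string: "b", count: 1 },
  { string: "a", count: 2 },
  { string: "c", count: 1 },
  { string: "a", count: 3 },
  { string: "b", count: 2 }
];

// Sort a copy descending by count, so each value's highest count comes first.
var sorted = entries.slice().sort(function (lhs, rhs) {
  return rhs.count - lhs.count;
});

// Keep only the first entry seen for each value.
var alreadyReturned = {};
var unique = sorted.filter(function (entry) {
  if (alreadyReturned[entry.string]) {
    return false;
  }
  alreadyReturned[entry.string] = true;
  return true;
});

console.log(JSON.stringify(unique));
```

For this sample, three entries survive: "a" with count 3, "b" with count 2, and "c" with count 1. Every entry with a lower running count is dropped because an entry with the same value has already been returned.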


I would use Eclipse Memory Analyzer instead.