Get top n elements from ruby array of hash values

This solution is not elegant in terms of being concise, but it has better time complexity. In other words, it should execute a lot faster for a very large number of hashes.

You will need to install the "algorithms" gem in order to use the Heap data structure:

Heaps are an efficient data structure when you need to find the largest or smallest elements in a group. This particular type of heap is optimal if the value of "n" is much smaller than the total number of pairs.

require 'algorithms'
def take_highest(result,n)
  max_heap = Containers::Heap.new(result){|x,y| (x["count"] <=> y["count"]) == 1}
  last = max_heap.pop
  count = 0
  highest = [last]
  loop do   
    top = max_heap.pop
    break if top.nil?
    count += (top["count"] == last["count"] ? 0 : 1)
    break if count == n
    highest << top
    last = top
  end
  highest
end

Enumerable#group_by to the rescue (as usual):

result.group_by { |r| r["count"] }
      .sort_by  { |k, v| -k }
      .first(2)
      .map(&:last)
      .flatten

Most of the work is done by the group_by. The sort_by simply lines things up so that first(2) will pick off the groups you want. Then map with last will extract the count/name hashes that you started with and the final flatten will clean up the extra left over arrays.


Starting in Ruby 2.2.0, max_by takes an extra argument that lets you ask for a certain number of top elements instead of just getting one. Using this, we can improve on mu is too short's answer

result = [
           {count: 3, name: 'user1'},
           {count: 10, name: 'user2'},
           {count: 10, name: 'user3'},
           {count: 2, name: 'user4'}
         ]
p result.group_by { |r| r[:count] }
      .max_by(2, &:first)
      .flat_map(&:last)
      .sort_by { |r| -r[:count] }
# => [{:count=>10, :name=>"user2"}, {:count=>10, :name=>"user3"}, {:count=>3, :name=>"user1"}]

The docs don't say if the array returned by max_by is sorted. If that turns out to be true though, we could just use reverse in the last step instead of sorting.


new_result = result.
  sort_by { |r| -r["count"] }.
  chunk { |r| r["count"] }.
  take(2).
  flat_map(&:last)

#=> [{"count"=>10, "name"=>"user3"}, 
#    {"count"=>10, "name"=>"user2"}, 
#    {"count"=> 3  "name"=>"user1"}]