How to implement a Median-heap
Isn't a perfectly balanced binary search tree (BST) a median heap? It is true that even red-black BSTs aren't always perfectly balanced, but it might be close enough for your purposes. And log(n) performance is guaranteed!
AVL trees are more tighly balanced than red-black BSTs so they come even closer to being a true median heap.
Here is a java implementaion of a MedianHeap, developed with the help of above comocomocomocomo 's explanation .
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;
import java.util.Scanner;
/**
*
* @author BatmanLost
*/
public class MedianHeap {
//stores all the numbers less than the current median in a maxheap, i.e median is the maximum, at the root
private PriorityQueue<Integer> maxheap;
//stores all the numbers greater than the current median in a minheap, i.e median is the minimum, at the root
private PriorityQueue<Integer> minheap;
//comparators for PriorityQueue
private static final maxHeapComparator myMaxHeapComparator = new maxHeapComparator();
private static final minHeapComparator myMinHeapComparator = new minHeapComparator();
/**
* Comparator for the minHeap, smallest number has the highest priority, natural ordering
*/
private static class minHeapComparator implements Comparator<Integer>{
@Override
public int compare(Integer i, Integer j) {
return i>j ? 1 : i==j ? 0 : -1 ;
}
}
/**
* Comparator for the maxHeap, largest number has the highest priority
*/
private static class maxHeapComparator implements Comparator<Integer>{
// opposite to minHeapComparator, invert the return values
@Override
public int compare(Integer i, Integer j) {
return i>j ? -1 : i==j ? 0 : 1 ;
}
}
/**
* Constructor for a MedianHeap, to dynamically generate median.
*/
public MedianHeap(){
// initialize maxheap and minheap with appropriate comparators
maxheap = new PriorityQueue<Integer>(11,myMaxHeapComparator);
minheap = new PriorityQueue<Integer>(11,myMinHeapComparator);
}
/**
* Returns empty if no median i.e, no input
* @return
*/
private boolean isEmpty(){
return maxheap.size() == 0 && minheap.size() == 0 ;
}
/**
* Inserts into MedianHeap to update the median accordingly
* @param n
*/
public void insert(int n){
// initialize if empty
if(isEmpty()){ minheap.add(n);}
else{
//add to the appropriate heap
// if n is less than or equal to current median, add to maxheap
if(Double.compare(n, median()) <= 0){maxheap.add(n);}
// if n is greater than current median, add to min heap
else{minheap.add(n);}
}
// fix the chaos, if any imbalance occurs in the heap sizes
//i.e, absolute difference of sizes is greater than one.
fixChaos();
}
/**
* Re-balances the heap sizes
*/
private void fixChaos(){
//if sizes of heaps differ by 2, then it's a chaos, since median must be the middle element
if( Math.abs( maxheap.size() - minheap.size()) > 1){
//check which one is the culprit and take action by kicking out the root from culprit into victim
if(maxheap.size() > minheap.size()){
minheap.add(maxheap.poll());
}
else{ maxheap.add(minheap.poll());}
}
}
/**
* returns the median of the numbers encountered so far
* @return
*/
public double median(){
//if total size(no. of elements entered) is even, then median iss the average of the 2 middle elements
//i.e, average of the root's of the heaps.
if( maxheap.size() == minheap.size()) {
return ((double)maxheap.peek() + (double)minheap.peek())/2 ;
}
//else median is middle element, i.e, root of the heap with one element more
else if (maxheap.size() > minheap.size()){ return (double)maxheap.peek();}
else{ return (double)minheap.peek();}
}
/**
* String representation of the numbers and median
* @return
*/
public String toString(){
StringBuilder sb = new StringBuilder();
sb.append("\n Median for the numbers : " );
for(int i: maxheap){sb.append(" "+i); }
for(int i: minheap){sb.append(" "+i); }
sb.append(" is " + median()+"\n");
return sb.toString();
}
/**
* Adds all the array elements and returns the median.
* @param array
* @return
*/
public double addArray(int[] array){
for(int i=0; i<array.length ;i++){
insert(array[i]);
}
return median();
}
/**
* Just a test
* @param N
*/
public void test(int N){
int[] array = InputGenerator.randomArray(N);
System.out.println("Input array: \n"+Arrays.toString(array));
addArray(array);
System.out.println("Computed Median is :" + median());
Arrays.sort(array);
System.out.println("Sorted array: \n"+Arrays.toString(array));
if(N%2==0){ System.out.println("Calculated Median is :" + (array[N/2] + array[(N/2)-1])/2.0);}
else{System.out.println("Calculated Median is :" + array[N/2] +"\n");}
}
/**
* Another testing utility
*/
public void printInternal(){
System.out.println("Less than median, max heap:" + maxheap);
System.out.println("Greater than median, min heap:" + minheap);
}
//Inner class to generate input for basic testing
private static class InputGenerator {
public static int[] orderedArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = i;
}
return array;
}
public static int[] randomArray(int N){
int[] array = new int[N];
for(int i=0; i<N; i++){
array[i] = (int)(Math.random()*N*N);
}
return array;
}
public static int readInt(String s){
System.out.println(s);
Scanner sc = new Scanner(System.in);
return sc.nextInt();
}
}
public static void main(String[] args){
System.out.println("You got to stop the program MANUALLY!!");
while(true){
MedianHeap testObj = new MedianHeap();
testObj.test(InputGenerator.readInt("Enter size of the array:"));
System.out.println(testObj);
}
}
}
Here my code based on the answer provided by comocomocomocomo :
import java.util.PriorityQueue;
public class Median {
private PriorityQueue<Integer> minHeap =
new PriorityQueue<Integer>();
private PriorityQueue<Integer> maxHeap =
new PriorityQueue<Integer>((o1,o2)-> o2-o1);
public float median() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
if (minSize == 0 && maxSize == 0) {
return 0;
}
if (minSize > maxSize) {
return minHeap.peek();
}if (minSize < maxSize) {
return maxHeap.peek();
}
return (minHeap.peek()+maxHeap.peek())/2F;
}
public void insert(int element) {
float median = median();
if (element > median) {
minHeap.offer(element);
} else {
maxHeap.offer(element);
}
balanceHeap();
}
private void balanceHeap() {
int minSize = minHeap.size();
int maxSize = maxHeap.size();
int tmp = 0;
if (minSize > maxSize + 1) {
tmp = minHeap.poll();
maxHeap.offer(tmp);
}
if (maxSize > minSize + 1) {
tmp = maxHeap.poll();
minHeap.offer(tmp);
}
}
}
You need two heaps: one min-heap and one max-heap. Each heap contains about one half of the data. Every element in the min-heap is greater or equal to the median, and every element in the max-heap is less or equal to the median.
When the min-heap contains one more element than the max-heap, the median is in the top of the min-heap. And when the max-heap contains one more element than the min-heap, the median is in the top of the max-heap.
When both heaps contain the same number of elements, the total number of elements is even. In this case you have to choose according your definition of median: a) the mean of the two middle elements; b) the greater of the two; c) the lesser; d) choose at random any of the two...
Every time you insert, compare the new element with those at the top of the heaps in order to decide where to insert it. If the new element is greater than the current median, it goes to the min-heap. If it is less than the current median, it goes to the max heap. Then you might need to rebalance. If the sizes of the heaps differ by more than one element, extract the min/max from the heap with more elements and insert it into the other heap.
In order to construct the median heap for a list of elements, we should first use a linear time algorithm and find the median. Once the median is known, we can simply add elements to the Min-heap and Max-heap based on the median value. Balancing the heaps isn't required because the median will split the input list of elements into equal halves.
If you extract an element you might need to compensate the size change by moving one element from one heap to another. This way you ensure that, at all times, both heaps have the same size or differ by just one element.