Calculating the Median with Mysql

I propose a faster way.

Get the row count:

SELECT CEIL(COUNT(*)/2) FROM data;

Then take the middle value in a sorted subquery:

SELECT max(val) FROM (SELECT val FROM data ORDER BY val limit @middlevalue) x;

I tested this with a 5x10e6 dataset of random numbers and it will find the median in under 10 seconds.

This will find an arbitrary percentile by replacing the COUNT(*)/2 with COUNT(*)*n where n is the percentile (.5 for median, .75 for 75th percentile, etc).


val is your time column, x and y are two references to the data table (you can write data AS x, data AS y).

EDIT: To avoid computing your sums twice, you can store the intermediate results.

CREATE TEMPORARY TABLE average_user_total_time 
      (SELECT SUM(time) AS time_taken 
            FROM scores 
            WHERE created_at >= '2010-10-10' 
                    and created_at <= '2010-11-11' 
            GROUP BY user_id);

Then you can compute median over these values which are in a named table.

EDIT: Temporary table won't work here. You could try using a regular table with "MEMORY" table type. Or just have your subquery that computes the values for the median twice in your query. Apart from this, I don't see another solution. This doesn't mean there isn't a better way, maybe somebody else will come with an idea.