Percent to total in PostgreSQL without subquery
I guess the reason you want to eliminate the subquery is to avoid scanning the users table twice. Remember the total is the sum of the counts for each country.
WITH c AS (
    SELECT
        country_id,
        count(*) AS cnt
    FROM users
    WHERE cond1=...
    GROUP BY country_id
)
SELECT
    *,
    100.0 * cnt / (SELECT sum(cnt) FROM c) AS percent
FROM c;
This query builds a small CTE with the per-country statistics. It will only scan the users table once, and generate a small result set (only one row per country).
The total (SELECT sum(cnt) FROM c) is calculated only once on this small result set, so it uses negligible time.
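If you want to try the CTE form without a PostgreSQL server at hand, here is a minimal sketch using Python's built-in sqlite3 module. The tiny `users` table and the helper names `make_demo_db` and `percent_by_country_cte` are made up for illustration, and the WHERE conditions are left out because the original ones are elided.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table: 3 users in country 1, 1 user in country 2 (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country_id INTEGER)")
    conn.executemany(
        "INSERT INTO users (country_id) VALUES (?)",
        [(1,), (1,), (1,), (2,)],
    )
    return conn


def percent_by_country_cte(conn):
    """Return (country_id, cnt, percent) rows using the CTE form."""
    sql = """
        WITH c AS (
            SELECT country_id, count(*) AS cnt
            FROM users
            GROUP BY country_id
        )
        SELECT country_id, cnt,
               100.0 * cnt / (SELECT sum(cnt) FROM c) AS percent
        FROM c
        ORDER BY country_id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in percent_by_country_cte(make_demo_db()):
        print(row)
```

The `100.0 *` factor comes first on purpose: it forces the division to be done on a non-integer value, so the percentage is not truncated.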
You could also use a window function:
SELECT
    country_id,
    cnt,
    100.0 * cnt / (sum(cnt) OVER ()) AS percent
FROM (
    SELECT country_id, count(*) AS cnt FROM users GROUP BY country_id
) foo;
(This is the same as nightwolf's query with the errors removed, lol.)
Both queries take about the same time.
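As a sanity check that the two forms return the same numbers (not a timing benchmark), here is a rough sketch that runs both against an in-memory SQLite table from Python. The sample data and the names `CTE_SQL`, `WINDOW_SQL` and `run_both` are assumptions for the demo, and the window form needs SQLite 3.25 or newer.

```python
import sqlite3

# CTE form: the total is a scalar subquery over the small per-country result.
CTE_SQL = """
    WITH c AS (
        SELECT country_id, count(*) AS cnt FROM users GROUP BY country_id
    )
    SELECT country_id, cnt, 100.0 * cnt / (SELECT sum(cnt) FROM c) AS percent
    FROM c
    ORDER BY country_id
"""

# Window form: sum(cnt) OVER () totals the per-country rows of the derived table.
WINDOW_SQL = """
    SELECT country_id, cnt, 100.0 * cnt / (sum(cnt) OVER ()) AS percent
    FROM (
        SELECT country_id, count(*) AS cnt FROM users GROUP BY country_id
    ) foo
    ORDER BY country_id
"""


def run_both():
    """Run both queries on made-up data and return the two result lists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country_id INTEGER)")
    # Made-up data: 2 users in country 10, 5 in country 20, 1 in country 30.
    data = [(10,)] * 2 + [(20,)] * 5 + [(30,)] * 1
    conn.executemany("INSERT INTO users (country_id) VALUES (?)", data)
    return conn.execute(CTE_SQL).fetchall(), conn.execute(WINDOW_SQL).fetchall()


if __name__ == "__main__":
    cte_rows, window_rows = run_both()
    print(cte_rows == window_rows, cte_rows)
```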
I am not a PostgreSQL user, but the general solution would be to use window functions.
Read up on how to use them at http://developer.postgresql.org/pgdocs/postgres/tutorial-window.html
The best explanation I can give is: a window function lets you compute an aggregate per group on one field without a GROUP BY clause, while still keeping every row.
I believe this might do the trick:
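SELECT DISTINCT
    country_id,
    -- a window needs PARTITION BY; OVER (country_id) is not valid syntax
    COUNT(*) OVER (PARTITION BY country_id) AS cnt,
    -- multiply by 100.0 first to avoid integer division
    100.0 * COUNT(*) OVER (PARTITION BY country_id) / COUNT(*) OVER () AS percent
FROM
    users
WHERE
    cond1 = true AND cond2 = true AND cond3 = true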
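One thing worth seeing for yourself: a window function keeps every input row, so without GROUP BY the per-country figures are repeated once per user unless you add DISTINCT. Here is a small sketch of that behaviour, run through Python's sqlite3 (SQLite 3.25+ assumed); the table contents and the helper `window_rows` are invented for the demo.

```python
import sqlite3


def window_rows(conn, distinct):
    """Per-country count and percent via window functions only (no GROUP BY)."""
    select = "SELECT DISTINCT" if distinct else "SELECT"
    sql = f"""
        {select}
            country_id,
            COUNT(*) OVER (PARTITION BY country_id) AS cnt,
            100.0 * COUNT(*) OVER (PARTITION BY country_id) / COUNT(*) OVER () AS percent
        FROM users
        ORDER BY country_id
    """
    return conn.execute(sql).fetchall()


def make_demo_db():
    """Made-up data: 3 users in country 1, 1 user in country 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, country_id INTEGER)")
    conn.executemany(
        "INSERT INTO users (country_id) VALUES (?)", [(1,), (1,), (1,), (2,)]
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print(window_rows(conn, distinct=False))  # one row per user
    print(window_rows(conn, distinct=True))   # one row per country
```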
This is really old, but the select examples above either don't work or are overly complex.
SELECT
    country_id,
    COUNT(*),
    (COUNT(*) / (SUM(COUNT(*)) OVER() )) * 100
FROM
    users
WHERE
    cond1 = true AND cond2 = true AND cond3 = true
GROUP BY
    country_id
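The plain COUNT(*) column (the second column) is not necessary; it's just there for debugging, to check that you're getting the right results. The trick is the SUM on top of the COUNT: the window function runs after GROUP BY, so SUM(COUNT(*)) OVER() adds up the per-country counts across the whole result set.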
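To see the SUM-over-COUNT trick in action with a filter applied, here is a minimal sketch run through Python's sqlite3 (SQLite 3.25+ assumed). The `active` column is a made-up stand-in for the cond1/cond2/cond3 conditions, and `percent_by_country` is just a demo helper.

```python
import sqlite3


def percent_by_country(conn):
    """Group first, then let the window function total the grouped counts."""
    sql = """
        SELECT
            country_id,
            COUNT(*) AS cnt,
            -- 100.0 first: SQLite would otherwise do integer division here
            100.0 * COUNT(*) / SUM(COUNT(*)) OVER () AS percent
        FROM users
        WHERE active = 1          -- hypothetical stand-in for cond1..cond3
        GROUP BY country_id
        ORDER BY country_id
    """
    return conn.execute(sql).fetchall()


def make_demo_db():
    """Made-up data: one inactive user in country 1 is excluded by the filter."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, country_id INTEGER, active INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users (country_id, active) VALUES (?, ?)",
        [(1, 1), (1, 1), (1, 0), (2, 1), (2, 1), (3, 1)],
    )
    return conn


if __name__ == "__main__":
    for row in percent_by_country(make_demo_db()):
        print(row)
```

Note that the total only covers rows that pass the WHERE clause, so the percentages add up to 100 over the filtered set. Also, the `100.0 *` is there for SQLite's sake: in PostgreSQL, sum() over a bigint returns numeric, so the original expression does not truncate there.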
Hope this helps someone.
Also, if anyone wants to do this in Django, just hack up an aggregate:
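from django.db.models import Aggregate, DecimalField

class PercentageOverRecordCount(Aggregate):
    # The template is hard-coded, so neither `function` nor the
    # expression passed in appears in the generated SQL.
    function = 'OVER'
    template = '(COUNT(*) / (SUM(COUNT(*)) OVER() )) * 100'

    def __init__(self, expression, **extra):
        super().__init__(
            expression,
            output_field=DecimalField(),
            **extra
        )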
Now it can be used in annotate.