Why can't I use column aliases in the next SELECT expression?
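Some databases (MySQL, for instance) let you use a previously created alias in the GROUP BY or HAVING clause, but not in another expression of the same SELECT list or in the WHERE clause. This is because the expressions in a SELECT list are evaluated as if all at once, so the alias does not exist yet while the neighbouring expressions are being computed.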
The solution is to wrap the query in a subquery; the alias is then available to the outer query:
SELECT stddev_time, max_time, avg_time, min_time, cnt,
       ROUND(avg_time * cnt, 2) as slowdown
FROM (
    SELECT
        COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time,
        MAX(time) as max_time,
        ROUND(AVG(time), 2) as avg_time,
        MIN(time) as min_time,
        COUNT(path) as cnt,
        path
    FROM
        loadtime
    GROUP BY
        path
    ORDER BY
        avg_time DESC
    LIMIT 10
) X;
The order of execution of a query (and thus the evaluation of expressions and aliases) is NOT the same as the way it is written. The "general" position is that the clauses are evaluated in this sequence:
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
Hence column aliases are unknown to most of the query until the SELECT clause is complete (which is why you can use aliases in the ORDER BY clause). Table aliases, however, are established in the FROM clause, so they are understood in every clause from WHERE through ORDER BY.
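The effect of that sequence can be checked with a small sketch. It uses Python's built-in sqlite3 module purely for convenience; SQLite is an assumption here, not the asker's database, the two-column loadtime table and its rows are made up, and other products may behave differently.

```python
import sqlite3

# Hypothetical sample data: a cut-down version of the loadtime table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loadtime (path TEXT, time REAL)")
conn.executemany(
    "INSERT INTO loadtime VALUES (?, ?)",
    [("/a", 1.0), ("/a", 3.0), ("/b", 5.0)],
)

# Alias referenced in ORDER BY: accepted, because ORDER BY comes after SELECT.
ordered = conn.execute(
    "SELECT path, AVG(time) AS avg_time FROM loadtime "
    "GROUP BY path ORDER BY avg_time DESC"
).fetchall()

# Alias referenced inside the same SELECT list: rejected as an unknown column.
try:
    conn.execute(
        "SELECT path, AVG(time) AS avg_time, avg_time * 2 AS doubled "
        "FROM loadtime GROUP BY path"
    )
    same_select_error = None
except sqlite3.OperationalError as exc:
    same_select_error = str(exc)

print(ordered)
print(same_select_error)
```

The first statement runs and sorts by the alias; the second fails because avg_time is not a column of loadtime and the alias is not visible to its sibling expressions.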
The most common workaround is to encapsulate your query in a "derived table" (a subquery in the FROM clause).
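A minimal sketch of the derived-table workaround, again run through Python's sqlite3 module as an assumed stand-in for the real database. The table contents and the alias x are made up, and stddev_samp is left out because SQLite does not provide it.

```python
import sqlite3

# Hypothetical sample data: a cut-down version of the loadtime table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loadtime (path TEXT, time REAL)")
conn.executemany(
    "INSERT INTO loadtime VALUES (?, ?)",
    [("/a", 1.0), ("/a", 3.0), ("/b", 5.0)],
)

# The inner query defines avg_time and cnt; by the time the outer SELECT runs
# they are ordinary columns of the derived table x, so they can be combined.
rows = conn.execute(
    """
    SELECT path, avg_time, cnt, ROUND(avg_time * cnt, 2) AS slowdown
    FROM (
        SELECT path,
               ROUND(AVG(time), 2) AS avg_time,
               COUNT(path) AS cnt
        FROM loadtime
        GROUP BY path
    ) AS x
    ORDER BY avg_time DESC
    """
).fetchall()

for row in rows:
    print(row)
```

The ORDER BY is placed on the outer query here so that the final result order is explicit rather than inherited from the subquery.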
Suggested reading: Order Of Execution of the SQL query
Note: different SQL DBMSs have different specific rules regarding the use of aliases.
EDIT
The purpose of reminding readers of the logical clause sequence is that often (but not always) an alias only becomes referable AFTER the clause where it is declared. The most common case is that aliases declared in the SELECT clause can be used by the ORDER BY clause. In particular, an alias declared in a SELECT clause cannot be referenced within the same SELECT clause.
But please do note that, due to differences between products, not every DBMS will behave in this manner.
Aliases are not available until the virtual relation has actually been created. If you want to build additional expressions from the aliases themselves, you have to create the virtual relation with a subquery and then run an additional query on top of it. So I would modify your query to the following:
SELECT stddev_time, max_time, avg_time, min_time, ROUND(avg_time * cnt, 2) as slowdown, path
FROM (
    SELECT
        COALESCE(ROUND(stddev_samp(time), 2), 0) as stddev_time,
        MAX(time) as max_time,
        ROUND(AVG(time), 2) as avg_time,
        MIN(time) as min_time,
        COUNT(path) as cnt,
        path
    FROM
        loadtime
    GROUP BY
        path
    ORDER BY
        avg_time DESC
    LIMIT 10
) t;  -- many DBMSs require an alias on a derived table
I would add that the reason your second query worked is that the query planner recognized those columns as being defined directly in the table you are querying.