What does [FROM x, y] mean in Postgres?
The manual has detailed explanation for the comma in the FROM
list in the chapter Table Expressions:
The
FROM
Clause derives a table from one or more other tables given in a comma-separated table reference list.
FROM table_reference [, table_reference [, ...]]
A table reference can be a table name (possibly schema-qualified), or a derived table such as a subquery, a
JOIN
construct, or complex combinations of these. If more than one table reference is listed in theFROM
clause, the tables are cross-joined (that is, the Cartesian product of their rows is formed; see below).
The fact that comma-separated table references have been defined in an earlier version of the SQL standard than explicit JOIN
syntax does not make the comma wrong or outdated. Use explicit join syntax, where it's technically necessary (see below) or where it makes the query text clearer.
The manual again:
FROM T1 CROSS JOIN T2
is equivalent toFROM T1 INNER JOIN T2 ON TRUE
(see below). It is also equivalent toFROM T1, T2
.
But "equivalent" does not mean identical. There is a subtle difference, as the manual notes:
Note
This latter equivalence does not hold exactly when more than two tables appear, becauseJOIN
binds more tightly than comma. For exampleFROM T1 CROSS JOIN T2 INNER JOIN T3 ON condition
is not the same asFROM T1, T2 INNER JOIN T3 ON condition
because thecondition
can referenceT1
in the first case but not the second.
This related question demonstrates the relevance of the difference:
- “invalid reference to FROM-clause entry for table” in Postgres query
Basically, your observation is exactly right:
it seems to me that it's declaring a variable called query so it can use it multiple times.
Any function can be used as "table function" in the FROM
list.
And function parameters can reference columns from all tables to the left of the function, because the notation:
FROM apod, to_tsquery('neutrino|(dark & matter)') query
is really equivalent to:
FROM apod CROSS JOIN LATERAL to_tsquery('neutrino|(dark & matter)') AS query
The manual on LATERAL queries:
Table functions appearing in
FROM
can also be preceded by the key wordLATERAL
, but for functions the key word is optional; the function's arguments can contain references to columns provided by preceding FROM items in any case.
Bold emphasis mine.
The keyword AS
is completely optional noise before table aliases (as opposed to column aliases, where it's recommended not to omit AS
to avoid possible ambiguities). Related answer with more:
- Date column arithmetic in PostgreSQL query
It creates an implicit CROSS JOIN
. It's the SQL-89 syntax.
Here I use values(1)
and values(2)
to create pseduo-tables (value tables) merely for examples. The thing after them t(x)
, and g(y)
are called FROM-Aliases the character inside the parenthesis is the alias for the column (x
and y
respectively). You could just as easily create a table to test this.
SELECT *
FROM (values(1)) AS t(x), (values(2)) AS g(y)
Here is how you'd write it now.
SELECT *
FROM (values(1)) AS t(x)
CROSS JOIN (values(2)) AS g(y);
From there you can make this an implicit INNER JOIN
by adding a conditional.
SELECT *
FROM (values(1)) AS t(x)
CROSS JOIN (values(1)) AS g(z)
WHERE x = z;
Or the explicit and newer INNER JOIN
syntax,
SELECT *
FROM (values(1)) AS t(x)
INNER JOIN (values(1)) AS g(z)
ON ( x = z );
So in your example..
FROM apod, to_tsquery('neutrino|(dark & matter)') query
This is essentially the same as the newer syntax,
FROM apod
CROSS JOIN to_tsquery('neutrino|(dark & matter)') AS query
which is actually the same, in this case, because to_tsquery()
returns a row and not a set as,
SELECT title, ts_rank_cd(
textsearch,
to_tsquery('neutrino|(dark & matter)')
) AS rank
FROM apod
WHERE to_tsquery('neutrino|(dark & matter)') @@ textsearch
ORDER BY rank DESC
LIMIT 10;
However, the above could potentially cause to_tsquery('neutrino|(dark & matter)')
to occur twice, but in this case it doesn't -- to_tsquery
is marked as STABLE (verified with \dfS+ to_tsquery
).
STABLE
indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), etc. (It is inappropriate for AFTER triggers that wish to query rows modified by the current command.) Also note that the current_timestamp family of functions qualify as stable, since their values do not change within a transaction.
For a more complete comparison of the differences between SQL-89, and SQL-92, see also my answer here