How to use SELECT DISTINCT with RANDOM() function in PostgreSQL?
You can simplify your query to avoid the problem a priori:
SELECT p.cod, p.title
FROM product p
WHERE p.visible
AND EXISTS (
SELECT 1
FROM product_filter pf
JOIN filters f ON f.cod = pf.cod_filter
WHERE pf.cod_product = p.cod
)
ORDER BY random()
LIMIT 4;
Major points:
You have only columns from table
product
in the result, other tables are only checked for existence of a matching row. For a case like this theEXISTS
semi-join is likely the fastest and simplest solution. Using it does not multiply rows from the base tableproduct
, so you don't need to remove them again withDISTINCT
.LIMIT
has to come last, afterORDER BY
.I simplified WHERE
p.visible = 't'
top.visible
, because this should be a boolean column.
You either do a subquery
SELECT * FROM (
SELECT DISTINCT p.cod, p.title ... JOIN... WHERE
) ORDER BY RANDOM() LIMIT 4;
or you try GROUPing for those same fields:
SELECT p.cod, p.title, MIN(RANDOM()) AS o FROM ... JOIN ...
WHERE ... GROUP BY p.cod, p.title ORDER BY o LIMIT 4;
Which of the two expressions will evaluate faster depends on table structure and indexing; with proper indexing on cod and title, the subquery version will run faster (cod and title will be taken from index cardinality information, and cod is the only key needed for the JOIN, so if you index by title, cod and visible (used in the WHERE), it is likely that the physical table will not even be accessed at all.
I am not so sure whether this would happen with the second expression too.
Use a subquery. Don't forget the table alias, t
. LIMIT
comes after ORDER BY
.
SELECT *
FROM (SELECT DISTINCT a, b, c
FROM datatable WHERE a = 'hello'
) t
ORDER BY random()
LIMIT 10;