A better way to write this query?

While your query is valid, there is a lot I would differently.

Don't use CaMeL-case names in Postgres if it can be avoided. Your unnamed entity framework may force this nonsense upon you, but I don't want to deal with the double-quote mess, so I tested with your schema after removing all double-quotes - effectively making all identifiers lower-cased.
- Are PostgreSQL column names case-sensitive?
Don't use illegible or illegal column and table aliases (like "wordTexts.WordId"). That's a matter of taste and style (and sanity), but you also omitted the keyword AS where you should not and kept it where you could omit it.
- Date column arithmetic in PostgreSQL query
I also formatted some more to make it easier for me to grok the query. That last part is totally optional. But use some consistent formatting style.

Arriving at this:

SELECT *
FROM   PulledTexts
WHERE  Id IN (
   SELECT w.TextIdId
   FROM   WordTexts w  -- AS can be omitted for table alias
   LEFT   JOIN UniqueWords u ON w.WordIdId = u.Id  -- LEFT JOIN might be necessary here
   WHERE  u.WordText = 'automate'
   OR     w.TextIdId IN (
      SELECT w.TextIdId  -- AS and1 -- column alias only documentation here, not visible
      FROM   WordTexts w
      JOIN   UniqueWords u ON w.WordIdId = u.Id  -- LEFT JOIN misleading here
      WHERE  u.WordText = 'audit'

      INTERSECT
      SELECT w.TextIdId  -- AS and2  -- but don't omit AS for column alias
      FROM   WordTexts w
      JOIN   UniqueWords u ON w.WordIdId = u.Id
      WHERE  u.WordText = 'trial'
      )
   );

Which can be simplified to:

SELECT *
FROM  (
   SELECT w.TextIdId AS Id
   FROM   WordTexts   w
   JOIN   UniqueWords u ON w.WordIdId = u.Id  -- now we don't need LEFT any more
   WHERE  u.WordText = 'automate'

   UNION
   SELECT w.TextIdId
   FROM   WordTexts w
   JOIN   UniqueWords u ON w.WordIdId = u.Id
   WHERE  u.WordText = 'audit'

   INTERSECT
   SELECT w.TextIdId
   FROM   WordTexts w
   JOIN   UniqueWords u ON w.WordIdId = u.Id
   WHERE  u.WordText = 'trial'
   ) w
JOIN   PulledTexts p USING (Id)

We do not need additional parentheses because, according to the manual:

INTERSECT binds more tightly than UNION. That is, A UNION B INTERSECT C will be read as A UNION (B INTERSECT C).

But this will be faster when replacing multiple intersected subqueries:

SELECT *
FROM  (
   SELECT w.TextIdId AS Id
   FROM   WordTexts   w
   JOIN   UniqueWords u ON w.WordIdId = u.Id
   WHERE  u.WordText = 'automate'

   UNION
   SELECT TextIdId
   FROM   WordTexts w1
   JOIN   WordTexts w2 USING (TextIdId)
   WHERE  w1.WordIdId = (SELECT Id FROM UniqueWords WHERE WordText = 'audit')
   AND    w2.WordIdId = (SELECT Id FROM UniqueWords WHERE WordText = 'trial')
   ) w
JOIN   PulledTexts p USING (Id)

The INTERSECT part can be cast as relational division problem. Explanation in this related answer from just yesterday:

Counting with multiple joins

db<>fiddle here

Most important for performance is to have the right indexes. You probably should have a UNIQUE constraint on (WordIdId, TextIdId) in table WordTexts, which implements the currently missing index on those two columns in this order.

Does creating a unique constraint on a Postgres column remove the need to index it?

A better way to write this query?

Tags:

Performance

Postgresql

Subquery

Relational Division

Postgresql Performance

Related

Recent Posts