PostgreSQL joining using JSONB
This would be more efficient:
With json and json_array_elements() in pg 9.3
SELECT p.id AS p_id, p.data AS p_data
, c.id AS c_id, c.data AS c_data
FROM test p
LEFT JOIN LATERAL json_array_elements(p.data->'children') pc(child) ON TRUE
LEFT JOIN test c ON c.id = pc.child::text::int;
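To try the query, a minimal setup might look like the sketch below. The column types are an assumption (the source only shows the column names id and data); the rows are the ones visible in the sample output further down.

```sql
-- Assumed schema: only the column names id and data appear in the source.
CREATE TABLE test (
   id   int PRIMARY KEY
 , data json            -- jsonb (with jsonb_array_elements) would be the pg 9.4 variant
);

-- Rows as they appear in the sample output of this thread.
INSERT INTO test (id, data) VALUES
   (1, '{"parent": null, "children": [2, 3]}')
 , (2, '{"parent": 1, "children": [4, 5]}')
 , (3, '{"parent": 1, "children": []}')
 , (4, '{"parent": 2, "children": []}')
 , (5, '{"parent": 2, "children": []}');
```

With these rows, the query returns seven rows: the four parent/child pairs, plus one row each for ids 3, 4 and 5 with NULL in c_id and c_data, because the LEFT JOIN LATERAL keeps parents whose children array is empty.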
Use the -> operator instead of ->> in the reference to children. The way you have it, you'd first cast json / jsonb to text and then back to json.
The clean way to call a set-returning function is LEFT [OUTER] JOIN LATERAL. This includes rows without children. To exclude those, change to [INNER] JOIN LATERAL or CROSS JOIN, or use the shorthand syntax with a comma:
, json_array_elements(p.data->'children') pc(child)
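The two points above can be checked with a short sketch. The inline literals and the two-row CTE are illustrative stand-ins for the real table, not part of the original question.

```sql
-- -> keeps the value as json, ->> returns text (which would then need a cast back).
SELECT pg_typeof('{"children": [2, 3]}'::json ->  'children') AS single_arrow  -- json
     , pg_typeof('{"children": [2, 3]}'::json ->> 'children') AS double_arrow; -- text

-- CROSS JOIN LATERAL drops parents whose array is empty;
-- LEFT JOIN LATERAL ... ON TRUE would keep id 3 with a NULL child.
WITH test(id, data) AS (
   VALUES (1, '{"parent": null, "children": [2, 3]}'::json)
        , (3, '{"parent": 1, "children": []}'::json)
)
SELECT p.id, pc.child
FROM   test p
CROSS  JOIN LATERAL json_array_elements(p.data->'children') pc(child);
```

The second statement returns only the two rows for id 1 (children 2 and 3); id 3 disappears because its array yields no elements.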
The column aliases (p_id, p_data, c_id, c_data) avoid duplicate column names in the result.
With jsonb and jsonb_array_elements() in pg 9.4
EXPLAIN
SELECT p.id AS p_id, p.data AS p_data
, c.id AS c_id, c.data AS c_data
FROM test p
LEFT JOIN LATERAL jsonb_array_elements(p.data->'children') pc(child) ON TRUE
LEFT JOIN test c ON c.id = pc.child::text::int;
                                        QUERY PLAN
-------------------------------------------------------------------------------------------
 Hash Left Join  (cost=37.69..4826.24 rows=123000 width=72)
   Hash Cond: (((pc.child)::text)::integer = c.id)
   ->  Nested Loop Left Join  (cost=0.01..2482.31 rows=123000 width=68)
         ->  Seq Scan on test p  (cost=0.00..22.30 rows=1230 width=36)
         ->  Function Scan on jsonb_array_elements pc  (cost=0.01..1.01 rows=100 width=32)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)
         ->  Seq Scan on test c  (cost=0.00..22.30 rows=1230 width=36)
Aside: A normalized DB design with basic data types would be way more efficient for this.
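As a rough illustration of what such a design could look like: the table name node and the columns node_id and parent_id below are hypothetical, chosen for this sketch only. The children array becomes redundant once every row stores its parent.

```sql
-- Hypothetical normalized layout: one row per node, parent stored as a plain integer.
CREATE TABLE node (
   node_id   int PRIMARY KEY
 , parent_id int REFERENCES node   -- NULL for the root
);

-- Same hierarchy as the JSON sample data in this thread.
INSERT INTO node (node_id, parent_id) VALUES
   (1, NULL), (2, 1), (3, 1), (4, 2), (5, 2);

-- Index to support looking up children by parent.
CREATE INDEX node_parent_id_idx ON node (parent_id);

-- Parent/child pairs via a plain equality join, no JSON unnesting or casts.
SELECT p.node_id AS p_id, c.node_id AS c_id
FROM   node p
LEFT   JOIN node c ON c.parent_id = p.node_id
ORDER  BY p.node_id, c.node_id;
```

The join condition compares two integer columns directly, so the planner can use the index on parent_id instead of hashing the result of a function scan.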
Never mind, I found a way:
SELECT *
FROM ( SELECT *, json_array_elements((data->>'children')::JSON) child FROM test) x1
LEFT JOIN test x2
ON x1.child::TEXT::INT = x2.id
;
 id |                 data                 | child | id |               data
----+--------------------------------------+-------+----+-----------------------------------
  1 | {"parent": null, "children": [2, 3]} | 2     |  2 | {"parent": 1, "children": [4, 5]}
  1 | {"parent": null, "children": [2, 3]} | 3     |  3 | {"parent": 1, "children": []}
  2 | {"parent": 1, "children": [4, 5]}    | 4     |  4 | {"parent": 2, "children": []}
  2 | {"parent": 1, "children": [4, 5]}    | 5     |  5 | {"parent": 2, "children": []}

                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=37.67..4217.38 rows=123000 width=104)
   Hash Cond: ((((json_array_elements(((test.data ->> 'children'::text))::json)))::text)::integer = x2.id)
   ->  Seq Scan on test  (cost=0.00..643.45 rows=123000 width=36)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)
         ->  Seq Scan on test x2  (cost=0.00..22.30 rows=1230 width=36)
or
SELECT *
FROM test x1
LEFT JOIN ( SELECT *, json_array_elements((data->>'children')::JSON) child FROM test) x2
ON x1.id = x2.child::TEXT::INT
;
 id |                 data                 | id |                 data                 | child
----+--------------------------------------+----+--------------------------------------+-------
  2 | {"parent": 1, "children": [4, 5]}    |  1 | {"parent": null, "children": [2, 3]} | 2
  3 | {"parent": 1, "children": []}        |  1 | {"parent": null, "children": [2, 3]} | 3
  4 | {"parent": 2, "children": []}        |  2 | {"parent": 1, "children": [4, 5]}    | 4
  5 | {"parent": 2, "children": []}        |  2 | {"parent": 1, "children": [4, 5]}    | 5
  1 | {"parent": null, "children": [2, 3]} |    |                                      |

                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------
 Hash Right Join  (cost=37.67..4217.38 rows=123000 width=104)
   Hash Cond: ((((json_array_elements(((test.data ->> 'children'::text))::json)))::text)::integer = x1.id)
   ->  Seq Scan on test  (cost=0.00..643.45 rows=123000 width=36)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=36)
         ->  Seq Scan on test x1  (cost=0.00..22.30 rows=1230 width=36)