Postgis - Zonal statistics - Parallel computation?
i would add a intersects join clause on this query like this
DROP TABLE IF EXISTS schema.zonal_stats;
CREATE TABLE schema.zonal_stats AS
SELECT id,geom,
(ST_SummaryStats(ST_Union(ST_Clip(raster,geom)))).sum
FROM schema.raster join schema.ply on st_intersects(raster,geom)
GROUP BY id,geom;
this will run more efficiently because it will use an index and only clip the raster sections that intersect the vector.
In terms of the parrallel processing I am not well versed in what happens under the hood but it seems like you followed what the docs and other tutorials have said about setting the parralel workers etc..
Based on your comments that you pre-clipped the raster to the vector I would not even use the geometry in the query
I would try
SELECT id,rast,
(ST_SummaryStats(rast)).sum
FROM schema.raster
or if you are hell bent on the st_union
SELECT id,rast,
(ST_SummaryStats(st_union(rast))).sum
FROM schema.raster
group by id,rast
This doesn't answer the question 100% but does answer my problem, which is to speed up zonal statistics.
I am using the following exactextract command line tool for fast zonal statistics. This also takes into account partial overlaps, by weighting the zonal statistic for each raster value by its coverage fraction. Many zonal statistic tools do not account for this.
If someone can produce an example of multithreaded zonal statistics in postgis, I will substitute this as the marked answer with another.