Iterate through table, perform calculation on each row
I'm not sure if the proof of concept example does what you want. In general, with SQL, you almost never need a FOR loop. While you can use a function, if you have PostgreSQL 9.3 or later, you can use a LATERAL
subquery to perform subqueries for each row.
For example, create 10,000 random 3D points with a random value
column:
CREATE TABLE points(
gid serial primary key,
geom geometry(PointZ),
value numeric
);
CREATE INDEX points_geom_gist ON points USING gist (geom);
INSERT INTO points(geom, value)
SELECT ST_SetSRID(ST_MakePoint(random()*1000, random()*1000, random()*100), 0), random()
FROM generate_series(1, 10000);
For each point, search for the 100 nearest points (except the point in question), and find the residual between the points' value
and the average of the 100 nearest:
SELECT p.gid, p.value - avg(l.value) residual
FROM points p,
LATERAL (
SELECT value
FROM points j
WHERE j.gid <> p.gid
ORDER BY p.geom <-> j.geom
LIMIT 100
) l
GROUP BY p.gid
ORDER BY p.gid;
Doing updates row-by-row in a loop is almost always a bad idea and will be extremely slow and won't scale. You should really find a way to avoid that.
After having said that:
All your function is doing is to change the value of the column value in memory - you are just modifying the contents of a variable. If you want to update the data you need an update
statement:
You need to use an UPDATE
inside the loop:
CREATE OR REPLACE FUNCTION LoopThroughTable()
RETURNS VOID
AS
$$
DECLARE
t_row the_table%rowtype;
BEGIN
FOR t_row in SELECT * FROM the_table LOOP
update the_table
set resid = 1.0
where pk_column = t_row.pk_column; --<<< !!! important !!!
END LOOP;
END;
$$
LANGUAGE plpgsql;
Note that you have to add a where
condition on the primary key to the update
statement otherwise you would update all rows for each iteration of the loop.
A slightly more efficient solution is to use a cursor, and then do the update using where current of
CREATE OR REPLACE FUNCTION LoopThroughTable()
RETURNS VOID
AS $$
DECLARE
t_curs cursor for
select * from the_table;
t_row the_table%rowtype;
BEGIN
FOR t_row in t_curs LOOP
update the_table
set resid = 1.0
where current of t_curs;
END LOOP;
END;
$$
LANGUAGE plpgsql;
So if I execute the UPDATE query after the loop has finished, will that commit the changes to the table?
No. The call to the function runs in the context of the calling transaction. So you need to commit
after running SELECT LoopThroughTable()
if you have disabled auto commit in your SQL client.
Note that the language name is an identifier, do not use single quotes around it. You should also avoid using keywords like row
as variable names.
Using dollar quoting (as I did) also makes writing the function body easier