Trouble wrapping head around complex SQL delete query
You want to know all registrations that need to be kept. So your first query returns registrations within 5 previous years :
SELECT
Id
FROM
Registration
WHERE
Registration_date >= '2011-10-08'
then all registrations with clients related to the previous query :
SELECT
a2.Registration_id as Id
FROM
AssociatedClient AS a1
INNER JOIN AssociatedClient AS a2
ON a1.Client_id = a2.Client_id
WHERE
a1.Registration_id IN
(
SELECT
Id
FROM
Registration
WHERE
Registration_date >= '2011-10-08'
)
Then you have all registrations that you must not delete by combining the previous queries in an UNION
, and you want all clients that are not part of this query :
SELECT
Client_id
FROM
AssociatedClient
WHERE
Registration_id NOT IN
(
SELECT
Id
FROM
Registration
WHERE
Registration_date >= '2011-10-08'
UNION
SELECT
a2.Registration_id as Id
FROM
AssociatedClient AS a1
INNER JOIN AssociatedClient AS a2
ON a1.Client_id = a2.Client_id
WHERE
a1.Registration_id IN
(
SELECT
Id
FROM
Registration
WHERE
Registration_date >= '2011-10-08'
)
)
you can see the results in this SQL fiddle
Then you can delete the lines of clients without registration correspondig to the criterias using the following query :
DELETE FROM
AssociatedClient
WHERE
Client_id IN (<previous query>);
and all registrations not present in AssociatedClient :
DELETE FROM
Registration
WHERE
Id NOT IN (SELECT Registration_id FROM AssociatedClient)
Begin by identifying the registrations of the other clients of a registration. Here's a view:
create view groups as
select a.Client_id
, c.Registration_id
from AssociatedClient as a
join AssociatedClient as b on a.Registration_id = b.Registration_id
join AssociatedClient as c on b.Client_id = c.Client_id;
That gives us:
select Client_id
, min(Registration_id) as first
, max(Registration_id) as last
, count(distinct Registration_id) as regs
, count(*) as pals
from groups
group by Client_id;
Client_id first last regs pals
---------- ---------- ---------- ---------- ----------
2 2 8 4 5
3 2 8 4 18
4 5 5 1 1
5 2 8 4 5
7 10 10 1 1
8 9 9 1 1
You dont' need a view, of course; it's just for convenience. You could just use a virtual table. But inspect it carefully to convince yourself it produces the right range of "pal registrations" for each client. Note that the view does not reference Registration
. That's significant because it produces the same results even after we use it to delete from Registration
, so we can use it for the second delete statement.
Now we have a list of clients and their "pal registrations". What's the date of each pal's last registration?
select g.Client_id, max(Registration_date) as last_reg
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id;
g.Client_id last_reg
----------- ----------
2 2011-10-14
3 2011-10-14
4 2011-10-07
5 2011-10-14
7 2011-10-17
8 2011-10-14
Which ones have a latest date before a time certain?
select g.Client_id, max(Registration_date) as last_reg
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id
having max(Registration_date) < '2011-10-08';
g.Client_id last_reg
----------- ----------
4 2011-10-07
IIUC that would mean that client #4 should be deleted, and anything he registered for should be deleted. Registrations would be
select * from Registration
where Id in (
select Registration_id from groups as g
where Client_id in (
select g.Client_id
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id
having max(Registration_date) < '2011-10-08'
)
);
Id Registration_date
---------- -----------------
5 2011-10-07
And, sure enough, client #4 is in Registration #5, and is the only client subject to deletion by this test.
From there you can work out the delete
statements. I think the rule is "delete the client and anything he registered for". If so, I'd probably write the Registration IDs to a temporary table, and write the deletes for both Registration
and AssociatedClient
by joining to it.