Deleting repeating values in database
You can do this with some creative self-join logic.
Think of three hypothetical rows in the table.
- Row a you want to keep.
- Row b has the same product name and price, and a date 1 day after a. You want to delete this.
- Row c has the same product name and price, and a date 1 day after b. You want to keep this.
So if you can do a self-join to match these three rows, then delete row b.
DELETE b FROM MyTable AS a
JOIN MyTable AS b ON a.name=b.name AND a.price=b.price AND a.date=b.date + INTERVAL 1 DAY
JOIN MyTable AS c ON b.name=c.name AND b.price=c.price AND b.date=c.date + INTERVAL 1 DAY;
This works even if there are multiple rows that fit the conditions for row b. It'll delete the first one, and then continue to delete subsequent rows that also fit the conditions.
This works if you use the DATE
data type and store your dates as 'YYYY-MM-DD', not 'DD-MM-YYYY'. You should do this anyway.
You want to delete the rows where the product name and price are the same as the rows with the date plus/minus one day.
DELETE row_mid
FROM
record_table AS row_mid
JOIN record_table AS row_prev
JOIN record_table AS row_next
WHERE
row_mid.name = row_prev.name
AND row_mid.price = row_prev.price
AND row_mid.date = DATE_SUB(row_prev.date, INTERVAL 1 DAY)
AND row_mid.name = row_next.name
AND row_mid.price = row_next.price
AND row_mid.date = DATE_ADD(row_next.date, INTERVAL 1 DAY);
Is your MySQL new enough to support CTE? This is a pretty interesting problem I've seen with date scheduling. The code always looks awkward. To check the results without a delete you can switch the comment mark with the select and delete and comment out the t.[Name] is null line.
WITH
cte AS (
SELECT a.ID
, a.[Name]
, a.[Date]
, a.Price
, NextDate = max(npc.[Date]) -- Next Price change
, PrevDate = max(lpc.[Date]) -- Next Price change
FROM yourTable as a -- Base Table
LEFT JOIN
yourTable as npc -- Looking for Next Price Change
ON a.[Name] = npc.[Name]
and a.[Date] < npc.[Date]
and a.Price <> npc.Price
LEFT JOIN
yourTable as lpc -- Looking for Last Price Change
ON a.[Name] = lpc.[Name]
and a.[Date] > lpc.[Date]
and a.Price <> lpc.Price
GROUP BY a.ID, a.[Name], a.[Date], a.Price
)
----SELECT f.*, [Check] = CASE WHEN t.[Name] is null THEN 'DELETE' ELSE '' END
DELETE f
FROM
yourTable as f
LEFT JOIN
(
SELECT [Name], [GoodDate] = Max([Date])
FROM cte
GROUP BY [Name], PrevDate
UNION
SELECT [Name], [GoodDate] = Min([Date])
FROM cte
GROUP BY [Name], PrevDate
UNION
SELECT [Name], [GoodDate] = Max([Date])
FROM cte
GROUP BY [Name], NextDate
UNION
SELECT [Name], [GoodDate] = Min([Date])
FROM cte
GROUP BY [Name], NextDate
) as t
ON t.[Name] = f.[Name] and t.[GoodDate] = f.[Date]
WHERE t.[Name] is null
--ORDER BY f.[Name], f.[Date]