Delete duplicate records from a SQL table without a primary key
Add a Primary Key (code below)
Run the correct delete (code below)
Consider WHY you woudln't want to keep that primary key.
Assuming MSSQL or compatible:
ALTER TABLE Employee ADD EmployeeID int identity(1,1) PRIMARY KEY;
WHILE EXISTS (SELECT COUNT(*) FROM Employee GROUP BY EmpID, EmpSSN HAVING COUNT(*) > 1)
BEGIN
DELETE FROM Employee WHERE EmployeeID IN
(
SELECT MIN(EmployeeID) as [DeleteID]
FROM Employee
GROUP BY EmpID, EmpSSN
HAVING COUNT(*) > 1
)
END
It is very simple. I tried in SQL Server 2008
DELETE SUB FROM
(SELECT ROW_NUMBER() OVER (PARTITION BY EmpId, EmpName, EmpSSN ORDER BY EmpId) cnt
FROM Employee) SUB
WHERE SUB.cnt > 1
Use the row number to differentiate between duplicate records. Keep the first row number for an EmpID/EmpSSN and delete the rest:
DELETE FROM Employee a
WHERE ROW_NUMBER() <> ( SELECT MIN( ROW_NUMBER() )
FROM Employee b
WHERE a.EmpID = b.EmpID
AND a.EmpSSN = b.EmpSSN )
With duplicates
As
(Select *, ROW_NUMBER() Over (PARTITION by EmpID,EmpSSN Order by EmpID,EmpSSN) as Duplicate From Employee)
delete From duplicates
Where Duplicate > 1 ;
This will update Table and remove all duplicates from the Table!