Copy large table to archive table SQL Server 2012
You can use the following Stored Procedure, scheduled via a SQL Server Agent job, to archive records on a continual basis. You don't need to worry about the initial move of "everything older than 6 months", as that will be taken care of naturally by this process, since it can archive records faster than you are inserting them. Of course, if you want to get a large chunk of those rows archived immediately, you can run the Stored Procedure manually (while the job is not running) and just pass in slightly higher values for `@BatchSize` and `@SecondsToRun`.
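For example, a one-off manual run to work through the backlog faster might look like this (the parameter values are just illustrative):

EXEC dbo.AudLog_Archive
     @BatchSize = 3000,    -- per the proc's guidance, don't go above 3000
     @SecondsToRun = 600;  -- let it run for 10 minutes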
Important Notes:
- Schedule this to run via SQL Server Agent, probably once per hour (an example job setup is sketched after the Stored Procedure below)
- Set defaults for `@BatchSize` and `@SecondsToRun` to match how fast data is coming into the table
- Create a NonClustered Index on `dbo.AudLog ([XDATE] ASC)` (see the DDL sketch after this list)
- Assuming that `XLOGNO` is in the main `AudLog` table, remove the IDENTITY specification from the `XLOGNO` column here (in `AudLog_Backup`), as that value comes from the main `AudLog` table
- Change the datatype of the `DETAILS` column to be `VARCHAR(MAX)`: `text` has been deprecated since the release of SQL Server 2005 and is more difficult to work with
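The supporting DDL for the index and the column change might look like the following (the index name is just a suggestion, and adjust the NULL / NOT NULL to match your schema):

CREATE NONCLUSTERED INDEX [IX_AudLog_XDATE] -- hypothetical name
ON dbo.AudLog ([XDATE] ASC);

ALTER TABLE dbo.AudLog_Backup
ALTER COLUMN [DETAILS] VARCHAR(MAX) NULL; -- replaces the deprecated text type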
Minor notes:
- Running this Stored Procedure, at least via the SQL Server Agent job if not also manually at first to get a large chunk of rows moved over, should be less work than swapping the table out, and there is no period of time when the `AudLog` table is empty.
- The Stored Procedure will simply exit if there is nothing to archive, so it doesn't hurt to run it once per hour, or maybe every 2 - 3 hours (depending on how fast new rows come in).
- The Stored Procedure is throttled to exit after a certain number of seconds, so if it is causing contention on the main `AudLog` table, it will only be for short periods of time.
- The `OUTPUT` clause binds the `DELETE` from the main `AudLog` table and the `INSERT` into the `AudLog_Backup` table in a Transaction, so you don't need to worry about the process failing, or the system shutting down / crashing, between those two operations.
CREATE PROCEDURE dbo.AudLog_Archive
(
    @BatchSize INT = 2000,    -- don't go above 3000 to avoid lock escalation
    @SecondsToRun INT = 180,  -- run for 3 minutes
    @DaysToKeep INT = 185     -- archive rows older than 6 months
)
AS
SET NOCOUNT ON;

DECLARE @EndTime DATETIME = DATEADD(SECOND, @SecondsToRun, GETDATE()),
        @ArchiveDate DATETIME = DATEADD(DAY, @DaysToKeep * -1, GETDATE()),
        @RowsArchived INT = @BatchSize; -- initialize to be able to enter the loop

WHILE (@EndTime > GETDATE() AND @RowsArchived = @BatchSize)
BEGIN
    IF (EXISTS(
               SELECT 1
               FROM dbo.AudLog al
               WHERE al.[XDATE] < @ArchiveDate
              ))
    BEGIN
        ;WITH batch AS
        (
            -- Keep this as SELECT * as it will alert you, via job failure, if
            -- you add columns to AudLog but forget to add them to AudLog_Backup
            SELECT TOP (@BatchSize) al.*
            FROM dbo.AudLog al
            WHERE al.[XDATE] < @ArchiveDate
            ORDER BY al.[XDATE] ASC
        )
        DELETE b
        OUTPUT DELETED.* -- keep as * for same reason as noted above
        INTO dbo.AudLog_Backup ([PVKEY], [DKEY], ...) -- specify all columns
        FROM batch b;

        SET @RowsArchived = @@ROWCOUNT;

        WAITFOR DELAY '00:00:01.000'; -- one second delay for breathing room
    END
    ELSE
    BEGIN
        BREAK;
    END;
END;
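And a sketch of the hourly SQL Server Agent job, using the msdb stored procedures (the job name and database name here are assumptions; adjust for your environment):

USE msdb;

EXEC dbo.sp_add_job
    @job_name = N'Archive AudLog';        -- hypothetical job name

EXEC dbo.sp_add_jobstep
    @job_name = N'Archive AudLog',
    @step_name = N'Run dbo.AudLog_Archive',
    @subsystem = N'TSQL',
    @database_name = N'YourDatabase',     -- assumption: your database name
    @command = N'EXEC dbo.AudLog_Archive;';

EXEC dbo.sp_add_jobschedule
    @job_name = N'Archive AudLog',
    @name = N'Hourly',
    @freq_type = 4,             -- daily
    @freq_interval = 1,         -- every day
    @freq_subday_type = 8,      -- repeat in units of hours
    @freq_subday_interval = 1;  -- every 1 hour

EXEC dbo.sp_add_jobserver
    @job_name = N'Archive AudLog';        -- run on the local server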
Two approaches that I have seen used:
Approach 1
Create a table identical to `audlog`, say `audlog_new`. Adjust the IDENTITY seed if necessary so that it is greater than the largest ID in your current `audlog` table.
Rename your existing table to something else (`audlog_old`), and then rename `audlog_new` to `audlog`. Now you have all the time in the world to process your old data. You should be able to copy the last 6 months of data from what is now the old table to the new one. You can do it in batches.
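A sketch of the swap itself, assuming the IDENTITY column is named `ID` and that `audlog_new` has already been created as a structural copy of `audlog` (the reseed value shown is illustrative; use your actual MAX(ID)):

-- seed audlog_new's identity above the current maximum, then swap the names
DBCC CHECKIDENT ('dbo.audlog_new', RESEED, 123456789); -- illustrative value

BEGIN TRANSACTION;
    EXEC sp_rename 'dbo.audlog', 'audlog_old';
    EXEC sp_rename 'dbo.audlog_new', 'audlog';
COMMIT TRANSACTION;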
Approach 2
Similar to yours, but do it in batches of, say, 10,000 records. Use something like:

DECLARE @FirstID INT, @LastID INT;

-- capture the ID range of the oldest 10,000 qualifying rows
SELECT @FirstID = MIN(a.ID), @LastID = MAX(a.ID)
FROM (
      SELECT TOP 10000 ID
      FROM audlog
      WHERE XDATE <= DATEADD(DAY, -185, GETDATE())
      ORDER BY ID
     ) AS a;

INSERT INTO audlog_new
SELECT *
FROM audlog
WHERE ID BETWEEN @FirstID AND @LastID;

DELETE FROM audlog
WHERE ID BETWEEN @FirstID AND @LastID;
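To clear the whole backlog, that batch can be repeated until nothing qualifies. A minimal sketch, assuming `ID` is an ever-increasing IDENTITY column; the transaction is my addition, so that a failure between the copy and the delete cannot lose rows (the Stored Procedure above gets the same guarantee from its OUTPUT clause):

DECLARE @FirstID INT, @LastID INT;

WHILE 1 = 1
BEGIN
    SET @FirstID = NULL;
    SET @LastID = NULL;

    -- find the ID range of the next batch of qualifying rows
    SELECT @FirstID = MIN(a.ID), @LastID = MAX(a.ID)
    FROM (
          SELECT TOP 10000 ID
          FROM audlog
          WHERE XDATE <= DATEADD(DAY, -185, GETDATE())
          ORDER BY ID
         ) AS a;

    IF @FirstID IS NULL BREAK; -- nothing left to archive

    BEGIN TRANSACTION;
        INSERT INTO audlog_new
        SELECT * FROM audlog WHERE ID BETWEEN @FirstID AND @LastID;

        DELETE FROM audlog WHERE ID BETWEEN @FirstID AND @LastID;
    COMMIT TRANSACTION;

    WAITFOR DELAY '00:00:01'; -- breathing room between batches
END;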