Generate and Insert 1 million rows into simple table

Itzik Ben-Gan uses following approach This is probably the fastest way he found and he's quite smart :-)

WITH
  L0   AS (SELECT c FROM (SELECT 1 UNION ALL SELECT 1) AS D(c)), -- 2^1
  L1   AS (SELECT 1 AS c FROM L0 AS A CROSS JOIN L0 AS B),       -- 2^2
  L2   AS (SELECT 1 AS c FROM L1 AS A CROSS JOIN L1 AS B),       -- 2^4
  L3   AS (SELECT 1 AS c FROM L2 AS A CROSS JOIN L2 AS B),       -- 2^8
  L4   AS (SELECT 1 AS c FROM L3 AS A CROSS JOIN L3 AS B),       -- 2^16
  L5   AS (SELECT 1 AS c FROM L4 AS A CROSS JOIN L4 AS B),       -- 2^32
  Nums AS (SELECT ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS k FROM L5)

select k as id , 'a_' + cast (k as varchar) as a, 'b_' + cast (k/2 as varchar) as b into t1
from nums
where k <= 1000000

A variation on dnoeth's answer:

WITH Ten(N) AS 
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)   
SELECT
    id = IDENTITY(int, 1, 1)
INTO dbo.T1
FROM Ten T10
CROSS JOIN Ten T100
CROSS JOIN Ten T1000
CROSS JOIN Ten T10000
CROSS JOIN Ten T100000
CROSS JOIN Ten T1000000;

ALTER TABLE dbo.T1
ADD a AS CONVERT(varchar(11), id);

ALTER TABLE dbo.T1
ADD b AS CONVERT(varchar(11), id / 2);

This avoids storing the values of a and b; their values will be calculated at runtime as necessary. This may be cheating slightly, but it does have advantages:

  • No storage space used for columns a and b
  • The id column is directly typed as integer (4 bytes uncompressed); whereas ROW_NUMBER returns bigint (8 bytes uncompressed).
  • The id column is assigned the identity property, so it is not updateable.

Alternatively, storing all columns in the table:

WITH Ten(N) AS 
(
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL 
    SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)   
SELECT
    id = CONVERT(integer, ROW_NUMBER() OVER (ORDER BY T10.N)),
    a = CONVERT(varchar(11), ROW_NUMBER() OVER (ORDER BY T10.N)),
    b = CONVERT(varchar(11), ROW_NUMBER() OVER (ORDER BY T10.N) / 2)
INTO dbo.T1
FROM Ten T10
CROSS JOIN Ten T100
CROSS JOIN Ten T1000
CROSS JOIN Ten T10000
CROSS JOIN Ten T100000
CROSS JOIN Ten T1000000;

Note the conversion to integer on the id column, and the use of a specific length on the varchar types. See:

Bad habits to kick : declaring VARCHAR without (length) by Aaron Bertrand


Method 1: @dnoeth above, insert time: 1077ms - 1180ms (10 times testing)

Method 2: I try to insert using this method, insert time 989ms -> 1132ms
It's simple .

select t1.k as id , 'a_' + cast (t1.k as varchar) as a, 'b_' + cast (t1.k/2 as varchar) as b  into t1
from ( 
SELECT  ROW_NUMBER() OVER(ORDER BY a.object_id) as k 
from sys.all_columns, sys.all_columns a ) t1
where t1.k < 1000001

Method 3 : from Paul White's idea, 450ms

with x1 as (select top 1000 object_id from sys.all_columns )
SELECT  id = IDENTITY(int, 1, 1) into t1
from x1 a, x1 b
ALTER TABLE dbo.T1 ADD a AS 'a_' + CONVERT(varchar(20),  id);
ALTER TABLE dbo.T1 ADD b AS  'b_' + CONVERT(varchar(20),  id / 2);

Tags:

Sql Server