Expanding a dataset based on a weight field
An efficient way to perform this task uses an auxiliary table of numbers: simply a table holding the integers from 1 to n, where n is perhaps a million or so. Joining each sample row to the numbers from 1 up to its weight produces exactly that many copies of the row. A numbers table comes in handy for all sorts of everyday tasks.
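The query below assumes a table dbo.Numbers with an integer column n already exists. If you do not have one, a minimal sketch of one common way to build it follows; the TOP value and the sys.all_columns cross join are just illustrative choices, so adjust them to the size you need.

-- Sketch: build a one-million-row numbers table (dbo.Numbers, column n, matching the query below).
SELECT TOP (1000000)
    n = CONVERT(integer, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)))
INTO dbo.Numbers
FROM sys.all_columns AS AC1
CROSS JOIN sys.all_columns AS AC2;

-- Make the column NOT NULL so it can be the primary key.
ALTER TABLE dbo.Numbers
    ALTER COLUMN n integer NOT NULL;

ALTER TABLE dbo.Numbers
    ADD CONSTRAINT [PK dbo.Numbers n]
    PRIMARY KEY (n);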
CREATE TABLE dbo.Demo
(
    SampleNumber integer IDENTITY NOT NULL,
    SampleWeight integer NOT NULL,
    Attribute1 integer NOT NULL,
    Attribute2 integer NOT NULL,
    Attribute3 integer NOT NULL,

    CONSTRAINT [PK dbo.Demo SampleNumber]
        PRIMARY KEY (SampleNumber),

    CONSTRAINT [CK dbo.Demo SampleWeight 1-50]
        CHECK (SampleWeight BETWEEN 1 AND 50)
);

INSERT INTO dbo.Demo
(
    SampleWeight,
    Attribute1,
    Attribute2,
    Attribute3
)
VALUES
    (2, 23, 32, 65),
    (1, 32, 56, 75),
    (3, 54, 25, 87);

SELECT
    D.SampleNumber,
    D.Attribute1,
    D.Attribute2,
    D.Attribute3
FROM dbo.Demo AS D
JOIN dbo.Numbers AS N
    ON N.n BETWEEN 1 AND D.SampleWeight;

DROP TABLE dbo.Demo;
Output:
╔══════════════╦════════════╦════════════╦════════════╗
║ SampleNumber ║ Attribute1 ║ Attribute2 ║ Attribute3 ║
╠══════════════╬════════════╬════════════╬════════════╣
║ 1 ║ 23 ║ 32 ║ 65 ║
║ 1 ║ 23 ║ 32 ║ 65 ║
║ 2 ║ 32 ║ 56 ║ 75 ║
║ 3 ║ 54 ║ 25 ║ 87 ║
║ 3 ║ 54 ║ 25 ║ 87 ║
║ 3 ║ 54 ║ 25 ║ 87 ║
╚══════════════╩════════════╩════════════╩════════════╝
You can use a recursive CTE to do it.
Here is a SQL Fiddle for it: http://sqlfiddle.com/#!3/0b172/8
create table t (sampn int, weight int, attrib1 int, attrib2 int, attrib3 int);

insert into t values (1, 2, 23, 32, 65);
insert into t values (2, 1, 32, 56, 75);
insert into t values (3, 3, 54, 25, 87);
-- solution using a recursive CTE
;WITH cte (
    sampn
    ,[weight]
    ,attrib1
    ,attrib2
    ,attrib3
    ,RepeatedTime
    )
AS (
    -- anchor: one row per sample; skips rows whose weight is zero
    SELECT sampn
        ,[weight]
        ,attrib1
        ,attrib2
        ,attrib3
        ,1
    FROM t
    WHERE [weight] <> 0
    UNION ALL
    -- recursive part: adds another copy of the row until the weight is reached
    SELECT sampn
        ,[weight]
        ,attrib1
        ,attrib2
        ,attrib3
        ,RepeatedTime + 1
    FROM cte a
    WHERE a.[weight] > a.RepeatedTime
    )
SELECT sampn
    ,attrib1
    ,attrib2
    ,attrib3
FROM cte
ORDER BY sampn
OPTION (MAXRECURSION 100); -- default limit is 100; raise it (or use 0 for no limit) if any weight can exceed 100
I agree with Paul that the numbers table is probably the best solution you are going to come up with in terms of manageability and performance. With that said, you could approach this problem from a different angle, using XML.
This solution is fairly easy to understand. Performance on a small number of rows should be fine, but you will have to test it against a large number of rows.
Update: I have added the exact code that I am using, which is working in my test environment.
USE tempdb;
GO

IF OBJECT_ID('tempdb.dbo.Demo') IS NOT NULL
BEGIN
    DROP TABLE dbo.Demo;
END
GO

CREATE TABLE dbo.Demo
(
    SampleNumber integer IDENTITY NOT NULL,
    SampleWeight integer NOT NULL,
    Attribute1 integer NOT NULL,
    Attribute2 integer NOT NULL,
    Attribute3 integer NOT NULL,

    CONSTRAINT [PK dbo.Demo SampleNumber]
        PRIMARY KEY (SampleNumber),

    CONSTRAINT [CK dbo.Demo SampleWeight 1-50]
        CHECK (SampleWeight BETWEEN 1 AND 50)
);

INSERT INTO dbo.Demo
(
    SampleWeight,
    Attribute1,
    Attribute2,
    Attribute3
)
VALUES
    (2, 23, 32, 65),
    (1, 32, 56, 75),
    (3, 54, 25, 87);
GO
SELECT
    y.SampleNumber,
    Attribute1 = Att.vals.value('@Attribute1', 'int'),
    Attribute2 = Att.vals.value('@Attribute2', 'int'),
    Attribute3 = Att.vals.value('@Attribute3', 'int')
FROM
(
    SELECT
        SampleNumber,
        -- repeat the XML fragment SampleWeight times
        Attribute = (SELECT CAST(REPLICATE(CAST(Attribute.Data AS VARCHAR(MAX)), SampleWeight) AS XML))
    FROM dbo.Demo
    CROSS APPLY
    (
        SELECT
            [@Attribute1] = Attribute1,
            [@Attribute2] = Attribute2,
            [@Attribute3] = Attribute3
        FOR XML PATH('data'), ROOT('root'), TYPE
    ) AS Attribute (Data)
) AS y
CROSS APPLY y.Attribute.nodes('/root/data') AS Att(vals);
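The trick here is that REPLICATE repeats the <root><data .../></root> fragment SampleWeight times inside the XML value, and nodes('/root/data') then shreds each repeated element back out into its own row, so a sample with weight 3 comes back three times, matching the numbers-table output above.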