Query to get queue position for each group
This is a Gaps and Islands question. See here for more details on problems like this.
This should do what you need:
-- Generate test data
DECLARE @Companies TABLE
(
ID INT
,Company NVARCHAR(100)
,Location NVARCHAR(100)
);
INSERT @Companies
SELECT *
FROM (VALUES (1, 'acme', 'new york')
,(2, 'acme', 'philadelphia')
,(3, 'genco', 'st.louis')
,(4, 'genco', 'san diego')
,(5, 'genco', 'san francisco')
,(6, 'acme', 'miami')
) AS CompanyLocations(ID, Company, Location);
-- Find company positions
;WITH cte_Companies
AS
(
SELECT ID
,Company
,CASE
WHEN LAG(Company) OVER(ORDER BY ID) = Company
THEN 1
ELSE 0
END AS CompanyPosition
FROM @Companies
)
SELECT ID, Company
FROM cte_Companies
WHERE CompanyPosition = 0
UPDATE Andriy noted that my solution was a SQL Servre 2012+ solution. The following code should work for versions down to 2005.
-- Generate test data
DECLARE @Companies TABLE
(
ID INT
,Company NVARCHAR(100)
,Location NVARCHAR(100)
);
INSERT @Companies
SELECT *
FROM (VALUES (1, 'acme', 'new york')
,(2, 'acme', 'philadelphia')
,(3, 'genco', 'st.louis')
,(4, 'genco', 'san diego')
,(5, 'genco', 'san francisco')
,(6, 'acme', 'miami')
-- Further test data
,(7, 'genco', 'London')
,(8, 'genco', 'Portsmouth')
) AS CompanyLocations(ID, Company, Location);
-- Find company positions
SELECT ID, Company
FROM @Companies c1
WHERE NOT EXISTS (
SELECT *
FROM @Companies c2
WHERE c1.Company = c2.Company
AND c1.ID - 1 = c2.ID
)
There is more than one option for you to choose from.
Apart from the LAG replacement method suggested in James Anderson's answer, you could for instance, employ the classic double ranking method of determining islands:
WITH
partitioned AS
(
SELECT
id,
company,
grp = ROW_NUMBER() OVER ( ORDER BY id ASC)
- ROW_NUMBER() OVER (PARTITION BY company ORDER BY id ASC)
FROM
dbo.atable
)
SELECT
id = MIN(id),
company
FROM
partitioned
GROUP BY
company,
grp
;
The grp
column helps you to distinguish between different islands of rows that have the same company
value (like, for instance, between the acme
rows before genco
and those after
genco` in your data sample). It is used as an additional grouping criterion in the main query.
If the id
values are already row numbers essentially (i.e. they have no gaps or duplicates), you can use id
instead of the first ROW_NUMBER call in the above query:
WITH
partitioned AS
(
SELECT
id,
company,
grp = id - ROW_NUMBER() OVER (PARTITION BY company ORDER BY id ASC)
FROM
dbo.atable
)
SELECT
id = MIN(id),
company
FROM
partitioned
GROUP BY
company,
grp
;
You could also have an alternative substitution for the missing LAG functionality, which uses OUTER APPLY:
SELECT
id,
company
FROM
dbo.atable AS t
OUTER APPLY
(
SELECT TOP 1
company
FROM
dbo.atable AS t2
WHERE
t2.id < t.id
ORDER BY
t2.id DESC
) AS o
WHERE
o.company <> t.company
OR o.company IS NULL
;
For each row the OUTER APPLY subquery returns company
from the preceding row (or NULL if it is the first row). The WHERE clause simply filters out the rows where the match has a different name or is a NULL.