Query to get queue position for each group

This is a Gaps and Islands question. See here for more details on problems like this.

This should do what you need:

-- Generate test data
DECLARE @Companies TABLE
(
    ID INT
    ,Company NVARCHAR(100)
    ,Location NVARCHAR(100)
);

INSERT @Companies
SELECT *
FROM    (VALUES (1, 'acme', 'new york')
                ,(2, 'acme', 'philadelphia')
                ,(3, 'genco', 'st.louis')
                ,(4, 'genco', 'san diego')
                ,(5, 'genco', 'san francisco')
                ,(6, 'acme', 'miami')
        ) AS CompanyLocations(ID, Company, Location);

-- Find company positions
;WITH cte_Companies
AS
(
    SELECT ID
           ,Company
           ,CASE 
              WHEN LAG(Company) OVER(ORDER BY ID) = Company  
              THEN 1
              ELSE 0
            END AS CompanyPosition
    FROM @Companies
)

SELECT ID, Company
FROM cte_Companies
WHERE CompanyPosition = 0

UPDATE Andriy noted that my solution was a SQL Servre 2012+ solution. The following code should work for versions down to 2005.

-- Generate test data
DECLARE @Companies TABLE
(
    ID INT
    ,Company NVARCHAR(100)
    ,Location NVARCHAR(100)
);

INSERT @Companies
SELECT *
FROM    (VALUES (1, 'acme', 'new york')
                ,(2, 'acme', 'philadelphia')
                ,(3, 'genco', 'st.louis')
                ,(4, 'genco', 'san diego')
                ,(5, 'genco', 'san francisco')
                ,(6, 'acme', 'miami')
                -- Further test data
                ,(7, 'genco', 'London')
                ,(8, 'genco', 'Portsmouth')
        ) AS CompanyLocations(ID, Company, Location);

-- Find company positions

SELECT ID, Company
FROM @Companies c1
WHERE NOT EXISTS    (
                        SELECT *
                        FROM @Companies c2
                        WHERE c1.Company = c2.Company
                        AND c1.ID - 1 = c2.ID
                    )

There is more than one option for you to choose from.

Apart from the LAG replacement method suggested in James Anderson's answer, you could for instance, employ the classic double ranking method of determining islands:

WITH
  partitioned AS
  (
    SELECT
      id,
      company,
      grp     = ROW_NUMBER() OVER (                     ORDER BY id ASC)
              - ROW_NUMBER() OVER (PARTITION BY company ORDER BY id ASC)
    FROM
      dbo.atable
  )
SELECT
  id      = MIN(id),
  company
FROM
  partitioned
GROUP BY
  company,
  grp
;

The grp column helps you to distinguish between different islands of rows that have the same company value (like, for instance, between the acme rows before genco and those aftergenco` in your data sample). It is used as an additional grouping criterion in the main query.

If the id values are already row numbers essentially (i.e. they have no gaps or duplicates), you can use id instead of the first ROW_NUMBER call in the above query:

WITH
  partitioned AS
  (
    SELECT
      id,
      company,
      grp     = id - ROW_NUMBER() OVER (PARTITION BY company ORDER BY id ASC)
    FROM
      dbo.atable
  )
SELECT
  id      = MIN(id),
  company
FROM
  partitioned
GROUP BY
  company,
  grp
;

You could also have an alternative substitution for the missing LAG functionality, which uses OUTER APPLY:

SELECT
  id,
  company
FROM
  dbo.atable AS t
  OUTER APPLY
    (
      SELECT TOP 1
        company
      FROM
        dbo.atable AS t2
      WHERE
        t2.id < t.id
      ORDER BY
        t2.id DESC
    ) AS o
WHERE
  o.company <> t.company
  OR o.company IS NULL
;

For each row the OUTER APPLY subquery returns company from the preceding row (or NULL if it is the first row). The WHERE clause simply filters out the rows where the match has a different name or is a NULL.

Query to get queue position for each group

Tags:

Sql Server

Group By

Sql Server 2008 R2

Gaps And Islands

T Sql

Related

Recent Posts