How to return a page of results from SQL?
I'd recommend either using LINQ, or try to copy what it does. I've got an app where I use the LINQ Take and Skip methods to retrieve paged data. The code looks something like this:
MyDataContext db = new MyDataContext();
var results = db.Products
.Skip((pageNumber - 1) * pageSize)
.Take(pageSize);
Running SQL Server Profiler reveals that LINQ is converting this query into SQL similar to:
SELECT [ProductId], [Name], [Cost], and so on...
FROM (
SELECT [ProductId], [Name], [Cost], [ROW_NUMBER]
FROM (
SELECT ROW_NUMBER() OVER (ORDER BY [Name]) AS [ROW_NUMBER],
[ProductId], [Name], [Cost]
FROM [Products]
)
WHERE [ROW_NUMBER] BETWEEN 10 AND 20
)
ORDER BY [ROW_NUMBER]
In plain English:
1. Filter your rows and use the ROW_NUMBER function to add row numbers in the order you want.
2. Filter (1) to return only the row numbers you want on your page.
3. Sort (2) by the row number, which is the same as the order you wanted (in this case, by Name).
There are essentially two ways of doing pagination in the database (I'm assuming you're using SQL Server):
Using OFFSET
Others have explained how the ROW_NUMBER() OVER()
ranking function can be used to perform pages. It's worth mentioning that SQL Server 2012 finally included support for the SQL standard OFFSET .. FETCH
clause:
SELECT first_name, last_name, score
FROM players
ORDER BY score DESC
OFFSET 40 ROWS FETCH NEXT 10 ROWS ONLY
If you're using SQL Server 2012 and backwards-compatibility is not an issue, you should probably prefer this clause as it will be executed more optimally by SQL Server in corner cases.
Using the SEEK Method
There is an entirely different, much faster, but less known way to perform paging in SQL. This is often called the "seek method" as described in this blog post here.
SELECT TOP 10 first_name, last_name, score
FROM players
WHERE (score < @previousScore)
OR (score = @previousScore AND player_id < @previousPlayerId)
ORDER BY score DESC, player_id DESC
The @previousScore
and @previousPlayerId
values are the respective values of the last record from the previous page. This allows you to fetch the "next" page. If the ORDER BY
direction is ASC
, simply use >
instead.
With the above method, you cannot immediately jump to page 4 without having first fetched the previous 40 records. But often, you do not want to jump that far anyway. Instead, you get a much faster query that might be able to fetch data in constant time, depending on your indexing. Plus, your pages remain "stable", no matter if the underlying data changes (e.g. on page 1, while you're on page 4).
This is the best way to implement paging when lazy loading more data in web applications, for instance.
Note, the "seek method" is also called keyset paging.
On MS SQL Server 2005 and above, ROW_NUMBER() seems to work:
T-SQL: Paging with ROW_NUMBER()
DECLARE @PageNum AS INT;
DECLARE @PageSize AS INT;
SET @PageNum = 2;
SET @PageSize = 10;
WITH OrdersRN AS
(
SELECT ROW_NUMBER() OVER(ORDER BY OrderDate, OrderID) AS RowNum
,OrderID
,OrderDate
,CustomerID
,EmployeeID
FROM dbo.Orders
)
SELECT *
FROM OrdersRN
WHERE RowNum BETWEEN (@PageNum - 1) * @PageSize + 1
AND @PageNum * @PageSize
ORDER BY OrderDate
,OrderID;
LINQ combined with lambda expressions and anonymous classes in .Net 3.5 hugely simplifies this sort of thing.
Querying the database:
var customers = from c in db.customers
join p in db.purchases on c.CustomerID equals p.CustomerID
where p.purchases > 5
select c;
Number of records per page:
customers = customers.Skip(pageNum * pageSize).Take(pageSize);
Sorting by any column:
customers = customers.OrderBy(c => c.LastName);
Getting only selected fields from server:
var customers = from c in db.customers
join p in db.purchases on c.CustomerID equals p.CustomerID
where p.purchases > 5
select new
{
CustomerID = c.CustomerID,
FirstName = c.FirstName,
LastName = c.LastName
};
This creates a statically-typed anonymous class in which you can access its properties:
var firstCustomer = customer.First();
int id = firstCustomer.CustomerID;
Results from queries are lazy-loaded by default, so you aren't talking to the database until you actually need the data. LINQ in .Net also greatly simplifies updates by keeping a datacontext of any changes you have made, and only updating the fields which you change.