Clustered columnstore index on small tables
Columnstore index has a big advantage on compress data size. The general aim of the Columnstore index is quickly read a bunch of data due to its compression.
CCI is Columnstore Clustered Index, Clustered is Clustered Index
Columnstore Index compresses data size from 4MB to 2MB.
We can look at the performance in two tables and three parts.
The first one is the minimal SELECT
operation:
SELECT * FROM Users_CCI WITH(INDEX=CCI_Users) WHERE Id=12333
SELECT * FROM Users_Clustered WHERE Id=12333
Result: There is a Columnstore Scan
here and wrong estimated. Further, logical read difference. You can say it doesn't matter for you, but, If you use a minimal SELECT query, probably use thousands of times. And it will affect total performance.
The second one is the minimal UPDATE
operation:
UPDATE Users_CCI SET Age=10 WHERE Id=2
UPDATE Users_Clustered SET Age=10 WHERE Id=2
Result: There are read, CPU, and time differences as we can see.
The third one is the REBUILD
operation:
USE [StackOverflow2013]
GO
ALTER INDEX [CCI_Users] ON [dbo].[Users_CCI] REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE)
GO
USE [StackOverflow2013]
GO
ALTER INDEX [PK_Users_Clustered_Id] ON [dbo].[Users_Clustered] REBUILD PARTITION = ALL WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
UPDATE Users_CCI SET Age=10
UPDATE Users_Clustered SET Age=10
If I rebuild and update all rows for index fragmentation, I will see a more fragmented Clustered Columnstore Index than the Clustered Index. And I didn't show, but the Clustered Columnstore Index
Rebuild process produces transaction log more than Clustered Index
.
Like the document says
- More than 10% of the operations on the table are updates and deletes. Large numbers of updates and deletes cause fragmentation.The fragmentation affects compression rates and query performance until you run an operation called reorganize that forces all data into the columnstore and removes fragmentation. For more information, see Minimizing index fragmentation in columnstore index.
If you have a small table you don't need to Columnstore index.
To @YunusUYANIK point for the potential downfalls of designing your schema catered to just one side, why not create both rowstore and columnstore indexes on your table catered appropriately to both scenarios? Sure you may end up indexing the same fields both ways, but the main drawback there would just be the increased use of storage space which generally is much less of a concern when planning for performance.
It will depend on your schema and the amount of data in your tables for each vendor, so you'll have to test to ensure your design of the indexes are being used in the appropriate queries for the different amounts of data based on your vendor predicates. In the worst case, you might have to use index hints sometimes too, but I feel that if you design both types of indexes correctly, that is not very likely.