Database size - MDF too large?
In your size estimates, have you taken into account the space taken by indexes? Also, if you have text fields that are multi-byte (N[VAR]CHAR rather than [VAR]CHAR) and the input files are UTF-8 or plain one-byte-per-character, that will push your storage requirements up by up to a factor of two. Furthermore, remember that if you have a clustered key/index on a table, its size affects all other indexes on the table, because non-clustered indexes include the clustered key value for every row. To give an extreme example: if a table has an NCHAR(10) key where an INT would do, and that is your clustered key/index, you are not only using an extra 16 bytes per row in the data pages, you are also wasting 16 bytes per row in every other index on that table.
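As a quick sanity check on the two-bytes-per-character point, DATALENGTH shows the storage cost of the same value in each type (the literal here is just an illustration):

```sql
-- Same text, twice the bytes when stored as NVARCHAR (UCS-2/UTF-16)
SELECT DATALENGTH(CAST('example' AS VARCHAR(20)))  AS VarcharBytes,   -- 7
       DATALENGTH(CAST('example' AS NVARCHAR(20))) AS NvarcharBytes;  -- 14
```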
Also, some space will be allocated but unused, either because the DB engine has left some space allocated after deletes so that it can be used again quickly for new data in that table or because the pattern of inserts and deletes has left many pages only part full.
You can run:
SELECT o.name
, SUM(ps.reserved_page_count)/128.0 AS ReservedMB
, SUM(ps.used_page_count)/128.0 AS UsedMB
, SUM(ps.reserved_page_count-ps.used_page_count)/128.0 AS DiffMB
FROM sys.objects o
JOIN sys.dm_db_partition_stats ps ON o.object_id = ps.object_id
WHERE OBJECTPROPERTYEX(o.object_id, 'IsMSShipped') = 0
GROUP BY o.name
ORDER BY SUM(ps.reserved_page_count) DESC
to get a quick look at what tables are taking up space.
Also, EXEC sp_spaceused run within that DB will return two result sets. The first lists the total space allocated in the filesystem for the data files and how much of that is unallocated; the second lists how much of the allocated space is used for data pages, for index pages, or is currently unused.
sp_spaceused will also return the space used by a given object, so you can loop it to build a table for analysis:
-- TEMP TABLES FOR ANALYSIS
CREATE TABLE #tTables (sName NVARCHAR(MAX), iRows BIGINT, iReservedKB BIGINT, iDataKB BIGINT, iIndexKB BIGINT, iUnusedKB BIGINT)
CREATE TABLE #tTmp (sName NVARCHAR(MAX), iRows BIGINT, sReservedKB NVARCHAR(MAX), sDataKB NVARCHAR(MAX), sIndexKB NVARCHAR(MAX), sUnusedKB NVARCHAR(MAX))
-- COLLECT SPACE USE PER TABLE
EXEC sp_msforeachtable 'INSERT #tTmp EXEC sp_spaceused [?];'
-- CONVERT NUMBER-AS-TEXT COLUMNS TO NUMBER TYPES FOR EASIER ANALYSIS
INSERT #tTables SELECT sName, iRows
, CAST(REPLACE(sReservedKB, ' KB', '') AS BIGINT)
, CAST(REPLACE(sDataKB , ' KB', '') AS BIGINT)
, CAST(REPLACE(sIndexKB , ' KB', '') AS BIGINT)
, CAST(REPLACE(sUnusedKB , ' KB', '') AS BIGINT)
FROM #tTmp
DROP TABLE #tTmp
-- DO SOME ANALYSIS
SELECT sName='TOTALS', iRows=SUM(iRows), iReservedKB=SUM(iReservedKB), iDataKB=SUM(iDataKB), iIndexKB=SUM(iIndexKB), iUnusedKB=SUM(iUnusedKB) FROM #tTables
SELECT * FROM #tTables ORDER BY iReservedKB DESC
-- CLEAN UP
DROP TABLE #tTables
The above code will output all the table sizes in one list, plus a single row for the totals. If needed, you can use the various system views (like sys.objects and sys.dm_db_partition_stats used in the first query above; see http://technet.microsoft.com/en-us/library/ms177862.aspx for much more detail) to get more detail, such as the space used by each index.
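For instance, a per-index breakdown can be built from the same views as the first query; this is a sketch along those lines (heaps show up with a NULL index name, labelled here for readability):

```sql
-- Sketch: space used per index, from the same DMVs as the per-table query
SELECT o.name AS TableName
     , ISNULL(i.name, 'HEAP') AS IndexName
     , SUM(ps.used_page_count) / 128.0 AS UsedMB
FROM sys.objects o
JOIN sys.indexes i
  ON i.object_id = o.object_id
JOIN sys.dm_db_partition_stats ps
  ON ps.object_id = i.object_id AND ps.index_id = i.index_id
WHERE OBJECTPROPERTYEX(o.object_id, 'IsMSShipped') = 0
GROUP BY o.name, i.name
ORDER BY SUM(ps.used_page_count) DESC;
```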
There are three classes of unused space in a data file:
- That which is not allocated to anything (this shows in the first result set from sp_spaceused with no object specified).
- That which is allocated to an object (reserved) but not currently used (this shows in the "unused" count in sp_spaceused's output).
- That locked in part-used pages (this will look to be used, as everything is allocated in single-page chunks, one page being 8,192 bytes long). This is harder to detect/calculate. It is due to a mix of two factors:
- Split pages. As data gets added you often end up with partly empty pages (the storage engine could always normalise page contents, but this would be very inefficient), and as rows are deleted page contents are not automatically packed (again they could be, but the extra I/O load is generally far from worth it).
- The storage engine won't split a row over multiple pages (this, together with the 8,192-byte page size, is where the per-row size limit comes from). If your rows are fixed size and take 1,100 bytes each, then you are going to "waste" at least 492 bytes of each data page allocated to that table (7 rows take 7,700 bytes and an 8th won't fit, so the remaining bytes go unused). The wider the rows, the worse this can be. Tables/indexes with variable-length rows (which are far more common than completely fixed-length ones) generally fare better, but the effect is harder to calculate for them.
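The rows-per-page arithmetic above can be reproduced directly; this sketch just re-derives the 7-rows/492-bytes figures for a hypothetical fixed 1,100-byte row:

```sql
-- 8,192-byte page, hypothetical fixed 1,100-byte rows
DECLARE @PageBytes INT = 8192, @RowBytes INT = 1100;
SELECT @PageBytes / @RowBytes AS RowsPerPage,           -- 7
       @PageBytes % @RowBytes AS UnusableBytesPerPage;  -- 492
```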
Another caveat here is large objects (TEXT columns, [N]VARCHAR(MAX) values above a certain size, and so on): these are placed off-page, with only a small pointer in the main row data to the value held elsewhere, so they can exceed the normal per-row size limit.
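If you want to see how much of a table's space is off-page LOB data, sys.dm_db_partition_stats exposes that separately; a sketch:

```sql
-- Sketch: off-page LOB space per table, via the same partition-stats DMV
SELECT o.name
     , SUM(ps.lob_used_page_count) / 128.0 AS LobUsedMB
FROM sys.objects o
JOIN sys.dm_db_partition_stats ps
  ON ps.object_id = o.object_id
WHERE OBJECTPROPERTYEX(o.object_id, 'IsMSShipped') = 0
GROUP BY o.name
ORDER BY SUM(ps.lob_used_page_count) DESC;
```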
tl;dr: Estimating expected database sizes can be a lot more involved than it is natural to initially assume.
Try running sp_spaceused on your database. As an example, it returns:
reserved data index_size unused
------------------ ------------------ ------------------ ------------------
6032 KB 2624 KB 1664 KB 1744 KB
To run it on the database, just USE the database and then run sp_spaceused.
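In full (the database name here is hypothetical):

```sql
USE YourDatabase;   -- hypothetical name; substitute your own
EXEC sp_spaceused;
```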
If it still shows a great deal of unused space, you can try the shrink again; sometimes I find it takes multiple tries. I also sometimes find it works best to shrink the individual file rather than the database as a whole. However, what you may find is that you have 2.9 TB of data and another 4+ TB of indexes, in which case 7.5 TB is pretty reasonable. If you want to get a feel for the amount of space (data & index) used by each table, you can run sp_spaceused at the table level as well. You can run it across all tables in the database by using the following command:
EXEC sp_msforeachtable 'EXEC sp_spaceused [?];'
Although, fair warning: sp_msforeachtable is undocumented and unsupported, and has been known to miss tables. On the other hand, I've had a fair amount of luck with it myself.
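If you do go the per-file route mentioned above, DBCC SHRINKFILE targets one file at a time; a sketch (the logical file name and target size are hypothetical — check sys.database_files for the real names):

```sql
-- List the logical file names and current sizes first (size is in 8 KB pages)
SELECT name, type_desc, size / 128 AS CurrentSizeMB
FROM sys.database_files;

-- Then shrink a specific data file to a target size in MB
DBCC SHRINKFILE (YourDatabase_Data, 500000);  -- hypothetical name / target
```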
All of that being said, your database SHOULD have a certain percentage of free space, depending on your expected growth. Basically you want to make sure that you have space for anywhere from six months to a couple of years' worth of growth. You will also want to check your autogrowth settings to make sure they are appropriate to your situation. In particular, given the size of your database, you do NOT want to be using a percentage-based autogrowth.
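Switching from percentage growth to a fixed increment is a one-line change per file; a sketch with hypothetical names and an increment you would tune to your environment:

```sql
ALTER DATABASE YourDatabase              -- hypothetical database name
MODIFY FILE (NAME = YourDatabase_Data,   -- hypothetical logical file name
             FILEGROWTH = 1024MB);       -- fixed increment instead of a percentage
```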