How do I get the size of a Vertica database?
There are several ways to get the size of a database, each suited to a slightly different use case. It's important to note that Vertica uses raw and compressed data in different ways, and that you should be conscious of which size you require. For example, licensing is based on the raw data size.
Raw Size
The raw size is useful for capacity planning or monitoring license utilization (it's the space the data would take up if it wasn't compressed). To get the raw data size of the entire database, you can either use the GET_COMPLIANCE_STATUS()
function, or query the system table v_internal.license_audits
.
The GET_COMPLIANCE_STATUS()
function retrieves information about the most recent audit. An audit estimates the raw data size of the database and stores the information in v_internal.license_audits
. By default, audits take place daily at 23:59 and can be configured or manually run.
Here's some example output:
dbadmin=> SELECT GET_COMPLIANCE_STATUS();
GET_COMPLIANCE_STATUS
---------------------------------------------------------------------------------
Raw Data Size: 4.83TB +/- 0.24TB
License Size : 30.00TB
Utilization : 16%
Audit Time : 2014-05-11 23:59:49.763799+00
Compliance Status : The database is in compliance with respect to raw data size.
License End Date: 10/30/2014
Days Remaining: 171.18
If you have permissions, you can directly query the license_audits
table:
SELECT /*+ label(license_utilization)*/
audit_start_timestamp,
database_size_bytes / ( 1024^3 ) AS database_size_gb,
license_size_bytes / ( 1024^3 ) AS license_size_gb,
usage_percent
FROM v_catalog.license_audits
ORDER BY audit_start_timestamp DESC
LIMIT 30;
Further, if you wish to get raw data size at the schema level, you can use this (from vertica.tips):
SELECT /*+ label(estimated_raw_size)*/
pj.anchor_table_schema,
pj.used_compressed_gb,
pj.used_compressed_gb * la.ratio AS raw_estimate_gb
FROM (SELECT ps.anchor_table_schema,
SUM(used_bytes) / ( 1024^3 ) AS used_compressed_gb
FROM v_catalog.projections p
JOIN v_monitor.projection_storage ps
ON ps.projection_id = p.projection_id
WHERE p.is_super_projection = 't'
GROUP BY ps.anchor_table_schema) pj
CROSS JOIN (SELECT (SELECT database_size_bytes
FROM v_catalog.license_audits
ORDER BY audit_start_timestamp DESC
LIMIT 1) / (SELECT SUM(used_bytes)
FROM V_MONITOR.projection_storage) AS ratio) la
ORDER BY pj.used_compressed_gb DESC;
Compressed Size
The compressed size is the actual size of the data on the disk. This is useful for estimating disk space usage as Vertica recommends that at least 40% of space is available at all times. You can get the compressed size from column_storage
or projection_storage
system tables.
Using projection_storage
will also return any empty tables (from vertica.tips):
SELECT /*+ label(compressed_table_size)*/
anchor_table_schema,
anchor_table_name,
SUM(used_bytes) / ( 1024^3 ) AS used_compressed_gb
FROM v_monitor.projection_storage
GROUP BY anchor_table_schema,
anchor_table_name
ORDER BY SUM(used_bytes) DESC;