Prometheus query to count unique label values
count(count by (a) (hello_info))
First you want an aggregator with a result per value of a, and then you can count them.
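To make the two steps concrete, here is a minimal Python sketch of the same idea outside Prometheus. The sample series and the helper names count_by and count_unique are hypothetical and only for illustration; the inner step groups series by the label value, and the outer step counts the resulting groups.

```python
from collections import Counter

# Hypothetical sample data: each dict is the label set of one hello_info series.
series = [
    {"a": "x", "instance": "host1"},
    {"a": "x", "instance": "host2"},
    {"a": "y", "instance": "host1"},
    {"a": "z", "instance": "host3"},
]


def count_by(series, label):
    """Inner step, like `count by (label) (...)`: one result per label value."""
    # Series without the label are grouped under an empty value.
    return Counter(s.get(label, "") for s in series)


def count_unique(series, label):
    """Outer step, like `count(...)`: the number of groups from the inner step."""
    return len(count_by(series, label))


per_value = count_by(series, "a")
unique_values = count_unique(series, "a")
print(dict(per_value))  # {'x': 2, 'y': 1, 'z': 1}
print(unique_values)    # 3
```

The inner counts (2, 1, 1) are discarded by the outer step; only the number of groups matters, which is why any aggregator producing one result per value of a would work for the inner step.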
The count(count(hello_info) by (a)) query is equivalent to the following SQL:
SELECT
time_bucket('5 minutes', timestamp) AS t,
COUNT(DISTINCT a)
FROM hello_info
GROUP BY t
See the description of the time_bucket() function (a TimescaleDB function).
I.e., it returns the number of distinct values of the a label for each 5-minute interval by default; see the staleness docs for details about the 5-minute interval.
If you need to calculate the number of unique values of the a label over a custom interval (for example, over the last day), use the following PromQL query instead:
count(count(last_over_time(hello_info[1d])) by (a))
The custom interval (1d in the case above) can be changed to an arbitrary value; see these docs for the possible values.
This query uses the last_over_time() function to select all the time series that were active during the last day. Time series can stop receiving new samples and become inactive at any time. Such time series aren't captured by a simple count(...) by (a) after 5 minutes of inactivity. New deployments in Kubernetes and horizontal pod autoscaling are the most frequent sources of large numbers of inactive time series (aka high churn rate).
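The difference between the two windows can be illustrated with a small Python simulation. This is only a sketch under simplifying assumptions: the data is made up, each series is reduced to the timestamp of its last sample, and the window is treated as open on the left, which may not match the exact boundary handling of a given Prometheus version.

```python
LOOKBACK_SECONDS = 5 * 60        # the default 5-minute interval mentioned above
DAY_SECONDS = 24 * 60 * 60       # the custom [1d] interval

# Hypothetical data: (value of label a, timestamp in seconds of the last sample).
series_last_seen = [
    ("x", 86_390),   # still receiving samples
    ("x", 86_000),   # another series with a="x", stopped ~7 minutes ago
    ("y", 86_000),   # stopped ~7 minutes ago
    ("z", 40_000),   # stopped ~13 hours ago
    ("w", -10_000),  # stopped more than a day ago
]
now = 86_400


def unique_values_in_window(series_last_seen, now, window):
    """Count distinct label values among series with a sample in (now - window, now]."""
    return len({a for a, ts in series_last_seen if now - window < ts <= now})


active_now = unique_values_in_window(series_last_seen, now, LOOKBACK_SECONDS)
active_last_day = unique_values_in_window(series_last_seen, now, DAY_SECONDS)
print(active_now)       # 1
print(active_last_day)  # 3
```

With the 5-minute window only a="x" is counted, while the one-day window also picks up the inactive series for y and z; w is excluded from both because its last sample is older than a day.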
Another example: to count the number of apps deployed in a Kubernetes cluster based on the distinct values of a label (e.g. app):
count(count(kube_pod_labels{app=~".*"}) by (app))