How to group by time bucket in ClickHouse and fill missing data with nulls/0s
You can generate zero-valued rows with the numbers table function, combine them with your query using UNION ALL, and then apply GROUP BY to the combined result.
So, your query will look like:
SELECT SUM(metric),
       time
FROM (
    SELECT toStartOfQuarter(toDate(1514761200+number*30*24*3600)) time,
           toUInt16(0) AS metric
    FROM numbers(30)
    UNION ALL
    SELECT toStartOfQuarter(created_at) AS time,
           metric
    FROM mytable
    WHERE created_at >= toDate(1514761200)
      AND created_at >= toDateTime(1514761200)
      AND created_at <= toDate(1546210800)
      AND created_at <= toDateTime(1546210800)
)
GROUP BY time
ORDER BY time
Note the toUInt16(0): the zero values must be of the same type as the metric column.
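To try the UNION ALL approach without a real table, here is a minimal sketch that swaps mytable for two inline sample rows. The sample dates and values, the column names d and v, and the use of toIntervalMonth with string dates (instead of Unix timestamps) are assumptions made for illustration. They are not part of the original query.

```sql
SELECT
    time,
    sum(metric) AS metric
FROM
(
    -- one zero row per month of 2018, snapped to the start of its quarter
    SELECT
        toStartOfQuarter(toDate('2018-01-01') + toIntervalMonth(number)) AS time,
        toUInt64(0) AS metric
    FROM numbers(12)
    UNION ALL
    -- hypothetical sample data standing in for mytable
    SELECT
        toStartOfQuarter(d) AS time,
        toUInt64(v) AS metric
    FROM
    (
        SELECT toDate('2018-02-15') AS d, 5 AS v
        UNION ALL
        SELECT toDate('2018-11-03'), 7
    )
)
GROUP BY time
ORDER BY time
```

This should return one row per quarter of 2018, with 0 for the two quarters that have no sample rows. Building the zero rows from a string date avoids depending on the server time zone, which affects how toDate interprets a Unix timestamp.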
Since ClickHouse 19.14 you can use the WITH FILL clause. It can fill missing quarters in this way:
WITH
    (
        SELECT toRelativeQuarterNum(toDate('1970-01-01'))
    ) AS init
SELECT
    -- build the date from the relative quarter number
    toDate('1970-01-01') + toIntervalQuarter(q - init) AS time,
    metric
FROM
(
    SELECT
        toRelativeQuarterNum(created_at) AS q,
        sum(rand()) AS metric
    FROM
    (
        -- generate some dates and metrics values with gaps
        SELECT toDate(arrayJoin(range(1514761200, 1546210800, ((60 * 60) * 24) * 180))) AS created_at
    )
    GROUP BY q
    ORDER BY q ASC WITH FILL FROM toRelativeQuarterNum(toDate(1514761200)) TO toRelativeQuarterNum(toDate(1546210800)) STEP 1
)
┌───────time─┬─────metric─┐
│ 2018-01-01 │ 2950782089 │
│ 2018-04-01 │ 2972073797 │
│ 2018-07-01 │          0 │
│ 2018-10-01 │  179581958 │
└────────────┴────────────┘
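The query above converts quarters to relative numbers because a numeric STEP on a Date column counts days, not quarters. For daily buckets that conversion is not needed. Here is a minimal sketch, assuming the mytable, created_at and metric names from the first query and a hypothetical January 2018 range:

```sql
SELECT
    toDate(created_at) AS day,
    sum(metric) AS metric
FROM mytable
WHERE created_at >= toDate('2018-01-01')
  AND created_at < toDate('2018-02-01')
GROUP BY day
-- on a Date column, STEP 1 advances one day; the TO bound is exclusive
ORDER BY day ASC WITH FILL FROM toDate('2018-01-01') TO toDate('2018-02-01') STEP 1
```

Rows added by WITH FILL get default values in the other columns, so metric is 0 for the missing days, not NULL.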