How to simulate a pivot table with BigQuery?
2020 update: fhoffa.x.pivot()
- https://towardsdatascience.com/easy-pivot-in-bigquery-one-step-5a1f13c6c710
Use conditional expressions to organize the results of a query into rows and columns. The queries below are written in BigQuery legacy SQL. In the first example, the most-revised Wikipedia articles whose titles start with 'Google' are organized into columns that show the revision count when the title meets a given condition, and 0 otherwise.
SELECT
  page_title,
  /* Populate these columns with the revision count when the condition holds, otherwise 0 */
  IF(page_title CONTAINS 'search', INTEGER(total), 0) AS search,
  IF(page_title CONTAINS 'Earth' OR page_title CONTAINS 'Maps', INTEGER(total), 0) AS geo
FROM
  /* Subselect to return top revised Wikipedia articles containing 'Google'
   * followed by additional text.
   */
  (SELECT
    TOP(title, 5) AS page_title,
    COUNT(*) AS total
  FROM
    [publicdata:samples.wikipedia]
  WHERE
    REGEXP_MATCH(title, r'^Google.+') AND wp_namespace = 0
  );
Result:
+---------------+--------+------+
| page_title    | search | geo  |
+---------------+--------+------+
| Google search |   4261 |    0 |
| Google Earth  |      0 | 3874 |
| Google Chrome |      0 |    0 |
| Google Maps   |      0 | 2617 |
| Google bomb   |      0 |    0 |
+---------------+--------+------+
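The query above relies on legacy SQL functions (TOP, CONTAINS, INTEGER, REGEXP_MATCH). For anyone on standard SQL, here is a rough sketch of the same idea. It assumes the table is reachable as `bigquery-public-data.samples.wikipedia`, and it replaces TOP with GROUP BY / ORDER BY / LIMIT, so the counts may differ slightly from the legacy result because TOP is approximate.

```sql
-- Standard SQL sketch of the legacy query above (table path is an assumption).
SELECT
  page_title,
  -- Revision count when the title matches, otherwise 0
  IF(page_title LIKE '%search%', total, 0) AS search,
  IF(page_title LIKE '%Earth%' OR page_title LIKE '%Maps%', total, 0) AS geo
FROM (
  -- Five most-revised articles whose title starts with 'Google' plus more text
  SELECT
    title AS page_title,
    COUNT(*) AS total
  FROM `bigquery-public-data.samples.wikipedia`
  WHERE REGEXP_CONTAINS(title, r'^Google.+') AND wp_namespace = 0
  GROUP BY title
  ORDER BY total DESC
  LIMIT 5
);
```

LIKE is case-sensitive, as is CONTAINS in legacy SQL, so the matching behaviour should stay the same.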
A similar example, without using a subquery:
SELECT SensorType, DATE(DTimestamp), AVG(data) avg,
FROM [data-sensing-lab:io_sensor_data.moscone_io13]
WHERE DATE(DTimestamp) IN ('2013-05-16', '2013-05-17')
GROUP BY 1, 2
ORDER BY 2, 3 DESC;
This generates a three-column table: sensor type, date, and average data. To "pivot" the result so that each date becomes its own column:
SELECT
SensorType,
AVG(IF(DATE(DTimestamp) = '2013-05-16', data, null)) d16,
AVG(IF(DATE(DTimestamp) = '2013-05-17', data, null)) d17
FROM [data-sensing-lab:io_sensor_data.moscone_io13]
GROUP BY 1
ORDER BY 2 DESC;
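In standard SQL, the same reshaping can also be written with the PIVOT operator instead of one AVG(IF(...)) per column. The sketch below uses made-up inline rows (the `readings` CTE and its values are purely illustrative, not taken from the sensor table) so it can be run without access to any dataset.

```sql
-- Hypothetical sample rows standing in for the sensor table
WITH readings AS (
  SELECT * FROM UNNEST([
    STRUCT('temp' AS SensorType, '2013-05-16' AS day, 20.0 AS data),
    ('temp', '2013-05-16', 22.0),
    ('temp', '2013-05-17', 25.0),
    ('humidity', '2013-05-16', 40.0),
    ('humidity', '2013-05-17', 50.0)
  ])
)
SELECT *
FROM readings
-- One output column per listed day; each holds AVG(data) for that day
PIVOT (AVG(data) FOR day IN ('2013-05-16' AS d16, '2013-05-17' AS d17))
ORDER BY d16 DESC;
-- Expected rows: humidity 40.0 50.0, then temp 21.0 25.0
```

The list of pivot values still has to be spelled out in the query, just as with the conditional-aggregate version; the operator mainly saves repeating the AVG(IF(...)) expression.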