Apply mode operation on categorical data in SQL
December 13, 2021
I have categorical log data in a BigQuery table that I want to process with a sliding window. I want to apply a MODE operation (most frequent value) over a window of 3 or 5 rows, so that one-off events or category changes are discarded. In the sample below, Power_State is the raw value and Target is the desired output.
|SysDT | Power_State | Target |
| -------- | -------- | -------- |
|2021-07-01 09:03:57+00:00| EDC | EDC |
|2021-07-01 09:08:57+00:00| EDC | EDC |
|2021-07-01 09:13:57+00:00| DWN | EDC |
|2021-07-01 09:18:57+00:00| EDC | EDC |
|2021-07-01 09:23:58+00:00| EDC | EDC |
|2021-07-01 09:28:59+00:00| DWN | EDC |
|2021-07-01 09:33:59+00:00| EDC | EDC |
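
To pin down the behavior I am after, here is a minimal reference sketch in Python (not BigQuery SQL). The function name `sliding_mode`, the `radius` parameter, and the tie-breaking rule are my own assumptions for illustration: `radius=1` corresponds to a 3-row window (1 preceding, 1 following), the window is simply truncated at the edges, and on a tie the row keeps its own value.

```python
from collections import Counter


def sliding_mode(values, radius=1):
    """Return, for each position, the most frequent value in a centered window.

    radius=1 gives a 3-row window, radius=2 a 5-row window.
    The window is truncated at the start and end of the list.
    """
    result = []
    for i, current in enumerate(values):
        window = values[max(0, i - radius): i + radius + 1]
        counts = Counter(window)
        best = max(counts.values())
        winners = [v for v, c in counts.items() if c == best]
        # Assumed tie-break: keep the row's own value if it is among the winners,
        # otherwise take the first winner encountered in the window.
        result.append(current if current in winners else winners[0])
    return result


# Power_State column from the sample table, ordered by SysDT
power_state = ["EDC", "EDC", "DWN", "EDC", "EDC", "DWN", "EDC"]
target = sliding_mode(power_state, radius=1)
print(target)
```

On the sample rows this yields EDC for every row, which matches the Target column: the isolated DWN readings are outvoted by their neighbours.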
I tried the OVER clause, which gives me the required sliding window, but I still need a MODE function to apply over it. Is there a way to rewrite the query below so that it does not need MODE, or to write a custom MODE function in BigQuery?
SELECT *, MODE(Power_State)
OVER(ORDER BY SysDT ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) as Target
FROM Master_Data.2021_07
ORDER BY SysDT
Any help is really appreciated. Thanks.