
Apply mode operation on categorical data in SQL

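I have categorical log data in a BigQuery table that I want to process with a sliding window. I want to apply a MODE operation (the most frequent value) over a window of 3 or 5 rows, so that one-off events or brief category changes are discarded. In the sample below, Power_State is the raw value and Target is the output I want.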

|SysDT | Power_State | Target |
| -------- | -------- | -------- |
|2021-07-01 09:03:57+00:00| EDC | EDC   |
|2021-07-01 09:08:57+00:00| EDC | EDC   |
|2021-07-01 09:13:57+00:00| DWN | EDC   |
|2021-07-01 09:18:57+00:00| EDC | EDC   |
|2021-07-01 09:23:58+00:00| EDC | EDC   |
|2021-07-01 09:28:59+00:00| DWN | EDC   |
|2021-07-01 09:33:59+00:00| EDC | EDC   |

I tried the OVER clause, which gives me the required sliding window, but I still need a MODE aggregate to apply over it, and BigQuery does not have one. Any ideas on how to modify this query to avoid MODE, or how to write a custom MODE function in BigQuery?

SELECT *, MODE(Power_State)
    OVER (ORDER BY SysDT ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS Target
FROM `Master_Data.2021_07`
ORDER BY SysDT
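
One possible way to get a windowed mode without a MODE function, as a minimal sketch rather than a definitive solution: collect each row's window into an array with ARRAY_AGG used as a window function, then pick the most frequent element with a correlated subquery over UNNEST. The column name window_states and the tie-breaking rule are assumptions made for illustration, and the sketch assumes Power_State contains no NULLs.

```sql
SELECT
  * EXCEPT (window_states),
  (
    -- Most frequent value inside this row's window.
    -- Ties are broken alphabetically (an arbitrary, illustrative choice).
    SELECT s
    FROM UNNEST(window_states) AS s
    GROUP BY s
    ORDER BY COUNT(*) DESC, s
    LIMIT 1
  ) AS Target
FROM (
  SELECT
    SysDT,
    Power_State,
    -- window_states is a hypothetical helper column: the states of the
    -- previous, current and next row (a window of size 3).
    ARRAY_AGG(Power_State) OVER (
      ORDER BY SysDT
      ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING
    ) AS window_states
  FROM `Master_Data.2021_07`
)
ORDER BY SysDT
```

For a window of size 5, the frame would become ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING. At the first and last rows the window is shorter than usual, so ties can occur there; the alphabetical tie-break only makes the result deterministic, and a different rule may fit the data better.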

Any help is really appreciated. Thanks.
