
The version below is for BigQuery Standard SQL and uses pure SQL only (no JS UDF):
#standardSQL
WITH `project.dataset.events` AS (
  SELECT 1 dt, 'add' event, '1' value UNION ALL
  SELECT 2, 'remove', '1' UNION ALL
  SELECT 6, 'add', '2' UNION ALL
  SELECT 8, 'add', '3' UNION ALL
  SELECT 11, 'add', '4' UNION ALL
  SELECT 23, 'remove', '3'
), cum AS (
  SELECT dt, event, value,
    SUM(IF(event = 'add', 1, -1)) OVER(PARTITION BY value ORDER BY dt) state
  FROM `project.dataset.events`
), pre AS (
  SELECT a.dt, a.event, a.value, a.state, b.value AS b_value,
    ARRAY_AGG(b.state ORDER BY b.dt DESC)[SAFE_OFFSET(0)] b_state,
    MAX(b.dt) b_dt
  FROM cum a
  JOIN cum b ON b.dt <= a.dt
  GROUP BY a.dt, a.event, a.value, a.state, b.value
)
SELECT dt, event, value,
  SPLIT(IFNULL(STRING_AGG(IF(b_state = 1, b_value, NULL) ORDER BY b_dt), '')) list_as_array,
  CONCAT('[', IFNULL(STRING_AGG(IF(b_state = 1, b_value, NULL) ORDER BY b_dt), ''), ']') list_as_string
FROM pre
GROUP BY dt, event, value
ORDER BY dt

The result is, "surprisingly" :o), exactly the same as for the JS UDF version I answered/posted previously:
Row  dt  event   value  list_as_array  list_as_string
1    1   add     1      1              [1]
2    2   remove  1                     []
3    6   add     2      2              [2]
4    8   add     3      2              [2,3]
                        3
5    11  add     4      2              [2,3,4]
                        3
                        4
6    23  remove  3      2              [2,4]
                        4
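Note: I think the above may be a bit over-engineered, but I just don't have time to polish or optimize it. That should be doable, and it is up to the question owner.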
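
To sanity-check the result table outside BigQuery, the query's logic can be replayed in a few lines of Python. This is only a rough model of the query above, not part of the original answer: it assumes every event has a distinct dt, treats a value as active when its running add/remove sum is exactly 1 (the `state` column of `cum`), and orders active values by the latest dt at which they appeared (the `b_dt` column of `pre`). The names `EVENTS` and `active_lists` are made up for this sketch.

```python
from collections import defaultdict

# Sample events copied from the query's inline `project.dataset.events` table.
EVENTS = [
    (1, "add", "1"),
    (2, "remove", "1"),
    (6, "add", "2"),
    (8, "add", "3"),
    (11, "add", "4"),
    (23, "remove", "3"),
]


def active_lists(events):
    """Return (dt, event, value, active_values) for each event, in dt order."""
    state = defaultdict(int)  # running add/remove sum per value (models `cum.state`)
    last_dt = {}              # latest dt seen for each value (models `pre.b_dt`)
    rows = []
    for dt, event, value in sorted(events):
        state[value] += 1 if event == "add" else -1
        last_dt[value] = dt
        # Keep values whose running sum is exactly 1, ordered by their latest dt.
        active = sorted(
            (v for v in state if state[v] == 1),
            key=lambda v: last_dt[v],
        )
        rows.append((dt, event, value, active))
    return rows


for dt, event, value, active in active_lists(EVENTS):
    print(dt, event, value, "[" + ",".join(active) + "]")
```

For the sample data the printed lists match the list_as_string column above, row for row. If several events can share the same dt, the model would need a tie-breaking rule, which the query does not define either.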