AnswerBun.com

How to use SQL PARTITION BY GROUPS?

Stack Overflow Asked by Radagast on November 5, 2020

I’m working with PostgreSQL 12, but the question is standard SQL.
I have a table like this:

| timestamp                | raw_value |
| ------------------------ | --------- |
| 2015-06-27T03:52:50.000Z | 0         |
| 2015-06-27T03:53:00.000Z | 0         |
| 2015-06-27T03:53:10.000Z | 1         |
| 2015-06-27T03:53:20.000Z | 1         |
| 2015-06-27T04:22:20.000Z | 1         |
| 2015-06-27T04:22:30.000Z | 0         |
| 2015-06-27T05:33:40.000Z | 1         |
| 2015-06-27T05:33:50.000Z | 1         |

I need to get the first and last timestamp of each group with raw_value = 1, i.e. needed result :

| start_time               | end_time                 |
| ------------------------ | ------------------------ |
| 2015-06-27T03:53:10.000Z | 2015-06-27T04:22:20.000Z |
| 2015-06-27T05:33:40.000Z | 2015-06-27T05:33:50.000Z |

My best effort so far looks like this:

SELECT timestamp, raw_value, row_number() over w as rn, first_value(obt) OVER w AS start_time, last_value(obt) OVER w AS end_time
FROM mytable
WINDOW w AS (PARTITION BY raw_value ORDER BY timestamp GROUPS CURRENT ROW )
ORDER BY timestamp;

Google doesn’t have much info about it, but according to the docs the "GROUPS" clause is exactly what I need, but the end result is wrong, because window functions simply copy value from the timestamp column:

| timestamp                | raw_value | rn  | start_time               | end_time                 |
| ------------------------ | --------- | --- | ------------------------ | ------------------------ |
| 2015-06-27T03:52:50.000Z | 0         | 1   | 2015-06-27T03:52:50.000Z | 2015-06-27T03:52:50.000Z |
| 2015-06-27T03:53:00.000Z | 0         | 2   | 2015-06-27T03:53:00.000Z | 2015-06-27T03:53:00.000Z |
| 2015-06-27T03:53:10.000Z | 1         | 1   | 2015-06-27T03:53:10.000Z | 2015-06-27T03:53:10.000Z |
| 2015-06-27T03:53:20.000Z | 1         | 2   | 2015-06-27T03:53:20.000Z | 2015-06-27T03:53:20.000Z |
| 2015-06-27T04:22:20.000Z | 1         | 3   | 2015-06-27T04:22:20.000Z | 2015-06-27T04:22:20.000Z |
| 2015-06-27T04:22:30.000Z | 0         | 3   | 2015-06-27T04:22:30.000Z | 2015-06-27T04:22:30.000Z |
| 2015-06-27T05:33:40.000Z | 1         | 4   | 2015-06-27T05:33:40.000Z | 2015-06-27T05:33:40.000Z |
| 2015-06-27T05:33:50.000Z | 1         | 5   | 2015-06-27T05:33:50.000Z | 2015-06-27T05:33:50.000Z |

At line#6 I’d expect the row number to reset to 1, but it doesn’t! I tried using BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING as well without luck.

I have created a DB Fiddle link for your convenience as well.

If there is any other way to achieve the same result in SQL (ok to be PG-specific) without window functions, I’d like to know.

2 Answers

Identify groups using row_number() - sum() trick, then choose min and max time for each identified group.

with grp as (
  select obt, raw_value
       , row_number() over w - sum(raw_value) over w as g
  from tm_series
  window w as (order by obt)
)
select min(obt), max(obt)
from grp
where raw_value = 1
group by g;

DB fiddle here.

(The GROUPS clause depends on window ordering and seems to have nothing common with your problem.)

Correct answer by Tomáš Záluský on November 5, 2020

Your updated fiddle here.

For an gaps and islands approach, first mark your transitions from raw_value = 0 to raw_value = 1

with mark_changes as (
  select obt, raw_value,
         case
           when raw_value = 0 then 0
           when raw_value = lag(raw_value) over (order by obt) then 0
           else 1
         end as transition
    from tm_series
), 

Keep only the raw_value = 1 rows, and sum() the preceding transition markers to place each row into a group.

id_groups as (
  select obt, raw_value, 
         sum(transition) over (order by obt) as grp_num
    from mark_changes
   where raw_value = 1
)

Use group by on these grp_num values to get your desired result.

select min(obt) as start_time, 
       max(obt) as end_time
  from id_groups
 group by grp_num
 order by min(obt);

Answered by Mike Organek on November 5, 2020

Add your own answers!

Related Questions

“Live” data capable alternative for Google Earth KML

1  Asked on February 15, 2021 by christoph1197

   

React component dispatching generic payload

1  Asked on February 14, 2021 by andyroo

   

CSS animted gradient

1  Asked on February 14, 2021

     

Running a value through an array of functions

2  Asked on February 14, 2021 by lars-holdaas

   

How to revert back to python 2.7?

1  Asked on February 14, 2021 by nikhil-shrivastava

         

Using same name for attribute and getter

1  Asked on February 13, 2021 by janpeterka

     

What is the purpose to use className beside functionName using colon(:)

1  Asked on February 13, 2021 by mahedi-hasan-durjoy

 

Python: access a variable in a module from another file

2  Asked on February 13, 2021 by yeroduk

 

How to get elements of a sublist in Java

5  Asked on February 13, 2021 by alvira

     

`SyntaxError` in if-else one-liner

2  Asked on February 13, 2021 by evgeniy-golovin

         

Ask a Question

Get help from others!

© 2022 AnswerBun.com. All rights reserved. Sites we Love: PCI Database, MenuIva, UKBizDB, Menu Kuliner, Sharing RPP