# Time series summary

Stack Overflow Asked by naanan_ on January 5, 2022

I am trying to sum values whose time lag is 1, i.e. I would like to summarize the rows by adding up the frequencies (freq) of rows whose dates differ by exactly one day within a particular group. I used the lag function to get the difference, but I am not sure how to proceed from here.

df <- df %>%
group_by(group) %>%
mutate(diff = dt - lag(dt))

df[!is.na(df$diff) & df$diff > 1,]$diff <- NA

For example:

group   dt          freq  diff
groupA  2016-03-21  1     NA
groupA  2016-03-22  1     1
groupA  2016-03-23  1     1
groupA  2016-03-26  2     NA
groupA  2016-03-28  1     NA
groupA  2016-03-29  3     1
groupA  2016-03-30  3     1
groupA  2016-03-31  5     1
groupB  2016-04-01  1     NA
groupB  2016-04-02  2     1

I need to group this into:

group   dt          freq  diff  duration
groupA  2016-03-21  1     NA    3 (1 + 1 + 1)
groupA  2016-03-22  1     1
groupA  2016-03-23  1     1
groupA  2016-03-26  2     NA    2
groupA  2016-03-28  1     NA    12 (1 + 3 + 3 + 5)
groupA  2016-03-29  3     1
groupA  2016-03-30  3     1
groupA  2016-03-31  5     1
groupB  2016-04-01  1     NA    3 (1 + 2)
groupB  2016-04-02  2     1

I also referred to this, but a cumulative sum does not work because jumps of more than a single day must not be counted together. Is looping in a custom function the only way?

## 2 Answers

You can do this more easily by grouping rows whose dates are exactly one day apart. The code below creates a helper column gap, which is then used to sum freq over consecutive days within the same group:

library(dplyr)

df %>%
  mutate(gap = cumsum(!c(TRUE, diff(as.Date(df$dt)) == 1))) %>%
  group_by(gap, group) %>%
  mutate(duration = sum(freq, na.rm = TRUE)) %>%
  ungroup() %>%
  select(-gap) %>%
  as.data.frame()

#            group         dt freq duration
#        1  groupA 2016-03-21    1        3
#        2  groupA 2016-03-22    1        3
#        3  groupA 2016-03-23    1        3
#        4  groupA 2016-03-26    2        2
#        5  groupA 2016-03-28    1       12
#        6  groupA 2016-03-29    3       12
#        7  groupA 2016-03-30    3       12
#        8  groupA 2016-03-31    5       12
#        9  groupB 2016-04-01    1        3
#        10 groupB 2016-04-02    2        3
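
If one row per run of consecutive days is wanted rather than a repeated duration column, the same helper-column idea can be combined with summarise. The following is only a sketch under some assumptions that are not part of the answer above: dt is already a Date, the gap counter is computed within each group, and the names runs, start and end are made up for illustration.

```r
library(dplyr)

# Sample data from the question (the precomputed diff column is not needed)
df <- data.frame(
  group = c(rep("groupA", 8), rep("groupB", 2)),
  dt = as.Date(c("2016-03-21", "2016-03-22", "2016-03-23", "2016-03-26",
                 "2016-03-28", "2016-03-29", "2016-03-30", "2016-03-31",
                 "2016-04-01", "2016-04-02")),
  freq = c(1, 1, 1, 2, 1, 3, 3, 5, 1, 2)
)

runs <- df %>%
  group_by(group) %>%
  # Start a new run whenever the step from the previous date is not exactly 1 day
  mutate(gap = cumsum(c(TRUE, diff(dt) != 1))) %>%
  group_by(group, gap) %>%
  # Collapse each run to a single row
  summarise(start = min(dt), end = max(dt), duration = sum(freq),
            .groups = "drop")

runs$duration
# 3 2 12 3
```

Computing gap inside group_by(group) means a run can never span two groups, even when the last date of one group and the first date of the next happen to be one day apart.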


Answered by M-- on January 5, 2022

Here is a tidyverse solution using dplyr::lead:

library(tidyverse)

df %>%
  mutate(dt = as.POSIXct(dt)) %>%
  group_by(group) %>%
  mutate(
    diff = pmin(c(1, diff(dt)), c(1, diff(lead(dt))), na.rm = TRUE),
    id = cumsum(c(TRUE, diff(diff) != 0) | diff > 1)) %>%
  group_by(group, id) %>%
  mutate(duration = sum(freq)) %>%
  ungroup() %>%
  select(-diff, -id)
# A tibble: 10 x 4
#   group  dt                   freq duration
#   <fct>  <dttm>              <int>    <int>
# 1 groupA 2016-03-21 00:00:00     1        3
# 2 groupA 2016-03-22 00:00:00     1        3
# 3 groupA 2016-03-23 00:00:00     1        3
# 4 groupA 2016-03-26 00:00:00     2        2
# 5 groupA 2016-03-28 00:00:00     1       12
# 6 groupA 2016-03-29 00:00:00     3       12
# 7 groupA 2016-03-30 00:00:00     3       12
# 8 groupA 2016-03-31 00:00:00     5       12
# 9 groupB 2016-04-01 00:00:00     1        3
#10 groupB 2016-04-02 00:00:00     2        3


Explanation: for each row, diff holds the smaller of the gaps to the preceding and the following date. We then look for changes in diff (or values greater than 1) and create a new grouping vector id, by which we calculate the summary metric sum(freq).
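
To see how the grouping vector is built, the helper columns can be kept instead of dropped. This is a sketch rather than the answer's exact code: it assumes dt is converted with as.Date instead of as.POSIXct, and the column names prev_gap, next_gap and nearest are made up here to avoid reusing the name diff.

```r
library(dplyr)

# Sample data from the question (the precomputed diff column is not needed)
df <- data.frame(
  group = c(rep("groupA", 8), rep("groupB", 2)),
  dt = as.Date(c("2016-03-21", "2016-03-22", "2016-03-23", "2016-03-26",
                 "2016-03-28", "2016-03-29", "2016-03-30", "2016-03-31",
                 "2016-04-01", "2016-04-02")),
  freq = c(1, 1, 1, 2, 1, 3, 3, 5, 1, 2)
)

steps <- df %>%
  group_by(group) %>%
  mutate(
    # Days since the previous date; the first row of a group gets placeholder 1
    prev_gap = c(1, as.numeric(diff(dt))),
    # Days until the next date; placeholder 1 in the first row, NA in the last
    next_gap = c(1, as.numeric(diff(lead(dt)))),
    # Smaller of the two gaps, ignoring the NA in the last row
    nearest = pmin(prev_gap, next_gap, na.rm = TRUE),
    # New id when nearest changes or when the row is more than 1 day from both neighbours
    id = cumsum(c(TRUE, diff(nearest) != 0) | nearest > 1)
  ) %>%
  ungroup()

steps$nearest
# 1 1 1 2 1 1 1 1 1 1
steps$id
# 1 1 1 2 3 3 3 3 1 1
```

Only the isolated date 2016-03-26 has nearest greater than 1, which is what separates the first and third runs of groupA.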

## Sample data

df <- read.table(text =
" group     dt           freq  diff
groupA    2016-03-21    1     NA
groupA    2016-03-22    1     1
groupA    2016-03-23    1     1
groupA    2016-03-26    2     NA
groupA    2016-03-28    1     NA
groupA    2016-03-29    3     1
groupA    2016-03-30    3     1
groupA    2016-03-31    5     1
groupB    2016-04-01    1     NA
groupB    2016-04-02    2     1 ", header = T)


## Update

# Sample data
df <- read.table(text =
" group     dt           freq  diff
groupA    2016-03-21    1     NA
groupA    2016-03-22    1     1
groupA    2016-03-23    1     1
groupA    2016-03-26    2     NA
groupA    2016-03-28    1     NA
groupA    2016-04-01    3     1
groupA    2016-04-02    3     1
groupA    2016-04-03    5     1
groupB    2016-04-01    1     NA
groupB    2016-04-02    2     1 ", header = T)

df %>%
  mutate(dt = as.POSIXct(dt)) %>%
  group_by(group) %>%
  mutate(
    diff = pmin(c(1, diff(dt)), c(1, diff(lead(dt))), na.rm = TRUE),
    id = cumsum(c(TRUE, diff(diff) != 0) | diff > 1)) %>%
  group_by(group, id) %>%
  mutate(duration = sum(freq)) %>%
  ungroup() %>%
  select(-diff, -id)
# A tibble: 10 x 4
#   group  dt                   freq duration
#   <fct>  <dttm>              <int>    <int>
# 1 groupA 2016-03-21 00:00:00     1        3
# 2 groupA 2016-03-22 00:00:00     1        3
# 3 groupA 2016-03-23 00:00:00     1        3
# 4 groupA 2016-03-26 00:00:00     2        2
# 5 groupA 2016-03-28 00:00:00     1        1
# 6 groupA 2016-04-01 00:00:00     3       11
# 7 groupA 2016-04-02 00:00:00     3       11
# 8 groupA 2016-04-03 00:00:00     5       11
# 9 groupB 2016-04-01 00:00:00     1        3
#10 groupB 2016-04-02 00:00:00     2        3
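
In both data sets, every break between runs involves an isolated date, which is what makes the minimum-of-neighbours value change. It may be worth checking inputs where two runs are separated by a jump with no isolated date in between. The sketch below uses hypothetical data (df2) and made-up column names (nearest, id_min, id_prev), and assumes dt is a Date; it compares the id built from the minimum of both neighbours with an id built from the preceding date only.

```r
library(dplyr)

# Hypothetical data: two runs (03-21..03-23 and 03-26..03-27), no isolated date
df2 <- data.frame(
  group = "groupA",
  dt = as.Date(c("2016-03-21", "2016-03-22", "2016-03-23",
                 "2016-03-26", "2016-03-27")),
  freq = c(1, 1, 1, 2, 2)
)

res <- df2 %>%
  group_by(group) %>%
  mutate(
    # Smaller of the gaps to the previous and next date (placeholder 1 in row 1)
    nearest = pmin(c(1, as.numeric(diff(dt))),
                   c(1, as.numeric(diff(lead(dt)))), na.rm = TRUE),
    # id based on changes in nearest
    id_min = cumsum(c(TRUE, diff(nearest) != 0) | nearest > 1),
    # id based only on the step from the preceding date
    id_prev = cumsum(c(TRUE, diff(dt) != 1))
  ) %>%
  ungroup()

res$id_min
# 1 1 1 1 1
res$id_prev
# 1 1 1 2 2
```

Here every date has a neighbour one day away, so nearest never changes and id_min puts all five rows into one run, while id_prev splits them at the three-day jump. For data like this, an id built from the preceding date alone appears to be the safer choice.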


Answered by Maurits Evers on January 5, 2022
