AnswerBun.com

Python Pandas cumsum with shift of n

Stack Overflow Asked by Gonzalo Polo on October 10, 2020

I would like to know whether there is an efficient way (avoiding for loops) of computing something like serie.cumsum(), but with a shift of n.

In the same way that serie.cumsum() can be seen as the inverse of serie.diff(1), I am looking for an inverse of diff(n). (I know that a proper inverse needs the initial values, but for simplicity I ignore them here.) It could be called cumsum_shift.

More explicitly, here is an implementation with a for loop (which I would like to avoid):

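import numpy as np
import pandas as pd

def cumsum_shift(s, shift=1, init_values=(0,)):
    # Result buffer; it gets a fresh 0..len(s)-1 index, not s.index
    s_cumsum = pd.Series(np.zeros(len(s)))
    # Seed the first `shift` entries (init_values needs at least `shift` items)
    for i in range(shift):
        s_cumsum.iloc[i] = init_values[i]
    # Each later entry adds s[i] to the result `shift` positions back
    for i in range(shift, len(s)):
        s_cumsum.iloc[i] = s_cumsum.iloc[i - shift] + s.iloc[i]
    return s_cumsum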

With shift = 1 this code does the same as the pandas method s.cumsum() (up to the initial value), but the pandas method is presumably implemented in compiled code, so it is much faster. Of course, you should always use s.cumsum() rather than implementing it yourself with a for loop.
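The shift = 1 case can be checked with a short sketch. The loop is repeated here only so the block runs on its own. One detail worth noting: the match with s.cumsum() is exact when init_values holds the first element of s; with an initial value of 0 the result differs from s.cumsum() by that constant.

```python
import numpy as np
import pandas as pd

def cumsum_shift(s, shift=1, init_values=(0,)):
    # Same loop as in the question, repeated so this block is self-contained
    s_cumsum = pd.Series(np.zeros(len(s)))
    for i in range(shift):
        s_cumsum.iloc[i] = init_values[i]
    for i in range(shift, len(s)):
        s_cumsum.iloc[i] = s_cumsum.iloc[i - shift] + s.iloc[i]
    return s_cumsum

s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])

# Seeded with the first element: identical to the built-in cumulative sum
seeded = cumsum_shift(s, shift=1, init_values=[s.iloc[0]])
same_as_cumsum = np.allclose(seeded, s.cumsum())

# Seeded with 0: off by the constant s.iloc[0] everywhere
zero_seeded = cumsum_shift(s, shift=1, init_values=[0])
offset = (s.cumsum() - zero_seeded).unique()
```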

My question, then, is:
How can cumsum_shift be done with pandas methods, avoiding a for loop?

Edit 1

Adding an example of input and output

Starting from:

s = pd.Series([1,10,100,2,20,200,5,50,500])
s.diff(3)
out[26]
0      NaN
1      NaN
2      NaN
3      1.0
4     10.0
5    100.0
6      3.0
7     30.0
8    300.0
dtype: float64

With this input, the output of cumsum_shift(s.diff(3), shift = 3, init_values = [1,10,100]) is again the original series s. Notice the shift of 3: plain cumsum(), e.g. s.diff(3).cumsum(), would not recover the original s:

cumsum_shift(s.diff(3), shift = 3, init_values= [1,10,100])
out[27]
0      1.0
1     10.0
2    100.0
3      2.0
4     20.0
5    200.0
6      5.0
7     50.0
8    500.0
dtype: float64
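To make the contrast concrete, here is a small sketch of what the plain cumsum() gives on the same differenced series. It uses only the example data from the question; the variable name plain is just for illustration.

```python
import pandas as pd

s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])

# Plain cumulative sum of the lag-3 differences: consecutive entries are
# added together, so the three interleaved subsequences get mixed up.
plain = s.diff(3).cumsum()

print(plain.iloc[3:].tolist())
# [1.0, 11.0, 111.0, 114.0, 144.0, 444.0]
```

The values from position 3 onward are running totals of 1, 10, 100, 3, 30, 300, which is not the original series.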

But let me emphasize that the initial values are not a big deal; a constant difference is not a problem. I would like to know how to perform a cumsum of a shifted, differenced series without having to use a for loop.

In the same way, if you do a diff() and then a cumsum(), you get back the original series up to the initial value:

s = pd.Series([1,10,100,2,20,200,5,50,500])
s.diff().cumsum()
out[28]
0      NaN
1      9.0
2     99.0
3      1.0
4     19.0
5    199.0
6      4.0
7     49.0
8    499.0
dtype: float64

I would like to know if there is some clever way of doing something like s.diff(n).cumsum(n) that returns something correct up to some constant initial values.

EDIT 2 – Reverse a Moving Average

Thinking of an application of the "shifted cumsum", I found another question on SO about how to reverse a moving average. I answered it using my cumsum_shift function, and I think it clarifies what I am asking here.
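The link between the two problems can be sketched as follows. For a simple moving average of window n, n times the one-step difference of the average equals the lag-n difference of the underlying series, so undoing the average reduces to undoing diff(n). The window length and data below are illustrative assumptions, not taken from the linked question.

```python
import numpy as np
import pandas as pd

n = 3  # window length (illustrative)
x = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500], dtype=float)

ma = x.rolling(n).mean()   # moving average; first n-1 entries are NaN
lhs = n * ma.diff()        # n * (ma[t] - ma[t-1]); first n entries are NaN
rhs = x.diff(n)            # x[t] - x[t-n]; first n entries are NaN

# From position n onward the two agree, up to floating-point rounding
identity_holds = np.allclose(lhs.iloc[n:], rhs.iloc[n:])
```

Recovering x from ma this way still needs the first n values of x as initial values.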

One Answer

You can use the pandas rolling(...).sum() method:

s.rolling(shift).sum()

However, you may want to fill the NaN values up to the shift with values from the original series.

Answered by Elif on October 10, 2020
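One loop-free possibility, offered as a sketch rather than as the accepted method: the recurrence in cumsum_shift only ever links positions that are `shift` apart, so the series splits into `shift` interleaved chains, and an ordinary cumulative sum within each chain gives the same result. As far as I can tell, s.rolling(shift).sum() on its own adds the last `shift` consecutive values and does not reproduce this recurrence. The name cumsum_shift_vectorized and the way init_values is handled below are my own assumptions.

```python
import numpy as np
import pandas as pd

def cumsum_shift_vectorized(s, shift=1, init_values=None):
    """Hypothetical loop-free variant of cumsum_shift (assumes len(s) >= shift)."""
    if init_values is None:
        init_values = [0] * shift
    vals = s.to_numpy(dtype=float, copy=True)
    # Seed the start of each of the `shift` interleaved chains
    vals[:shift] = init_values[:shift]
    # Positions i, i + shift, i + 2*shift, ... share the label i % shift
    groups = np.arange(len(vals)) % shift
    # Cumulative sum within each chain; the result keeps the original order
    out = pd.Series(vals).groupby(groups).cumsum()
    out.index = s.index
    return out

s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])
recovered = cumsum_shift_vectorized(s.diff(3), shift=3, init_values=[1, 10, 100])
print(recovered.tolist())
# [1.0, 10.0, 100.0, 2.0, 20.0, 200.0, 5.0, 50.0, 500.0]
```

With shift = 1 there is a single chain, so this reduces to a plain cumsum() seeded with the initial value.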
