Stack Overflow Asked by Gonzalo Polo on October 10, 2020
I would like to know whether there is an efficient way (avoiding for loops) of doing a serie.cumsum() but with a shift of n. In the same way that serie.cumsum() can be seen as the inverse of serie.diff(1), I am looking for an inverse of diff(n), which could be called cumsum_shift. (I know that a proper inverse needs the initial values, but for simplicity I ignore them here.)
More explicitly, here it is implemented with a for loop (which I would like to avoid):
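import numpy as np
import pandas as pd

def cumsum_shift(s, shift=1, init_values=(0,)):
    # Result starts as zeros; the first `shift` entries are seeded from init_values.
    s_cumsum = pd.Series(np.zeros(len(s)))
    for i in range(shift):
        s_cumsum.iloc[i] = init_values[i]
    # Each later entry adds the current value to the result `shift` positions back.
    for i in range(shift, len(s)):
        s_cumsum.iloc[i] = s_cumsum.iloc[i - shift] + s.iloc[i]
    return s_cumsum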
With shift = 1 this code does the same as the pandas method s.cumsum(), up to the initial value. The pandas method runs in compiled code (I guess), so it is much faster; you should always use s.cumsum() rather than implementing it yourself with a for loop.
My question, then, is: what would be the way of doing cumsum_shift with pandas methods, avoiding a for loop?
Adding an example of input and output
If you call it with:
s = pd.Series([1,10,100,2,20,200,5,50,500])
s.diff(3)
out[26]
0 NaN
1 NaN
2 NaN
3 1.0
4 10.0
5 100.0
6 3.0
7 30.0
8 300.0
dtype: float64
With this input, the output of cumsum_shift(s.diff(3), shift = 3, init_values = [1,10,100]) is again the original series s. Notice the shift of 3: plain cumsum(), e.g. s.diff(3).cumsum(), would not recover the original s:
cumsum_shift(s.diff(3), shift = 3, init_values= [1,10,100])
out[27]
0 1.0
1 10.0
2 100.0
3 2.0
4 20.0
5 200.0
6 5.0
7 50.0
8 500.0
dtype: float64
But let me emphasize that the initial values are not a big deal; a constant difference is not a problem. I would like to know how to perform a cumsum of a shifted, differenced series without having to use a for loop, in the same way that if you do a diff() and then a cumsum() you get back the original one up to the initial value:
s = pd.Series([1,10,100,2,20,200,5,50,500])
s.diff().cumsum()
out[28]
0 NaN
1 9.0
2 99.0
3 1.0
4 19.0
5 199.0
6 4.0
7 49.0
8 499.0
dtype: float64
I would like to know if there is some clever way of doing something like s.diff(n).cumsum(n) that returns something correct up to some constant initial values.
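One possible loop-free sketch of this idea: positions i, i+n, i+2n, ... form n independent chains, so grouping by position modulo n and taking a cumulative sum within each group reproduces the recurrence of the for loop. The name cumsum_shift_vectorized and its handling of init_values are assumptions made for illustration; they are not part of pandas.

```python
import numpy as np
import pandas as pd


def cumsum_shift_vectorized(s, shift=1, init_values=None):
    """Hypothetical helper: lag-`shift` cumulative sum without a Python loop."""
    if init_values is None:
        init_values = [0] * shift  # assumption: default to zero seeds
    values = s.to_numpy(dtype=float).copy()
    # The first `shift` entries are seeds; they overwrite whatever was there
    # (for example the NaN values produced by diff).
    k = min(shift, len(values))
    values[:k] = list(init_values)[:k]
    # Label each position by its remainder modulo `shift`, then cumsum per label.
    groups = np.arange(len(values)) % shift
    return pd.Series(values, index=s.index).groupby(groups).cumsum()


s = pd.Series([1, 10, 100, 2, 20, 200, 5, 50, 500])
restored = cumsum_shift_vectorized(s.diff(3), shift=3, init_values=[1, 10, 100])
print(restored.tolist())
# [1.0, 10.0, 100.0, 2.0, 20.0, 200.0, 5.0, 50.0, 500.0]
```

Unlike the loop version, this sketch keeps the index of the input series rather than creating a fresh one.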
EDIT 2 – Reverse a Moving Average
Thinking of an application of the "shifted cumsum", I found this other question on SO about how to reverse a moving average, which I have answered using my cumsum_shift function. I think it clarifies what I am asking here.
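To illustrate the connection (this is a sketch under assumptions, not necessarily the approach used in that other answer): for a simple moving average of window n, n times the difference of consecutive averages equals s[i] - s[i-n], which is s.diff(n). Undoing the average then reduces to a lag-n cumulative sum. The example assumes the first n original values are known, and it uses a group-by-remainder cumulative sum in place of the explicit loop.

```python
import numpy as np
import pandas as pd

n = 3  # window size (assumed known)
s = pd.Series([1.0, 10.0, 100.0, 2.0, 20.0, 200.0, 5.0, 50.0, 500.0])
ma = s.rolling(n).mean()

# For i >= n: n * (ma[i] - ma[i-1]) == s[i] - s[i-n], i.e. s.diff(n).
lagged_diff = n * ma.diff()

# Seed the first n positions with the known initial values (an assumption).
values = lagged_diff.to_numpy().copy()
values[:n] = s.to_numpy()[:n]

# Lag-n cumulative sum: cumsum within each remainder class modulo n.
groups = np.arange(len(values)) % n
recovered = pd.Series(values, index=s.index).groupby(groups).cumsum()

# Compare with a tolerance, since the averages introduce rounding error.
print(np.allclose(recovered, s))
```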
You can use the pandas method rolling() along with sum():
s.rolling(shift).sum()
However, you may want to fill the NaN values in the first shift positions with the values from the original data.
Answered by Elif on October 10, 2020