How to speed up dataframe std() calculation on each row?

Stack Overflow Asked on December 1, 2021

I have a simple pandas DataFrame and need to compute, for each row, the standard deviation of that row together with all previous rows. I can do this easily with a for loop, but it is slow: 1000 rows took about 4 seconds. Is there any way to speed it up?

Results:

       a
0      0
1      1
2      2
3      3
4      4
..   ...
995  995
996  996
997  997
998  998
999  999

10:21:18.320780 starting loop
10:21:22.861962 ending loop

       std
0      0.0
1      1.0
2      1.6
3      2.2
4      2.7
..     ...
995  574.9
996  575.5
997  576.1
998  576.6
999  577.2

Code:

import pandas as pd
import numpy as np
import math
from datetime import datetime

df = pd.DataFrame(data=np.arange(1000), columns=['a'])
print(df)

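# float dtype so the rounded results are stored without changing the column type
df_std = pd.DataFrame(0.0, index=np.arange(len(df)), columns=['std'])
print('{} starting loop'.format(datetime.now().strftime('%H:%M:%S.%f')))
for i in range(1, len(df_std)):
    # sum of squares of rows 0..i, recomputed from scratch on every iteration
    su = np.sum([math.pow(df['a'].iloc[t], 2) for t in range(i + 1)])
    # .loc writes directly into df_std and avoids chained assignment
    df_std.loc[i, 'std'] = round(math.sqrt(su / i), 1)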

print('{} ending loop'.format(datetime.now().strftime('%H:%M:%S.%f')))
print(df_std)

Updated:
I need to do something like this:

for i in range(1, len(df_std)):
    df_std['std'].iloc[i] = df['a'].rolling(window=i).std()

That is, I need the std() value for each df row, computed over a window whose size changes with the row: for i=5 the window is the first 5 rows of df, for i=500 it is the first 500 rows, and so on.

2 Answers

To compute the standard deviation of each row together with all previous rows, use an expanding window (ddof=0 gives the population standard deviation):

stds = df.a.expanding().std(ddof=0)
print(stds.head())

Output

0    0.0
1    0.5
2    0.8
3    1.1
4    1.4

Answered by Balaji Ambresh on December 1, 2021
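As a rough check of the expanding-window approach, the sketch below compares it against an explicit loop over growing slices, which is what the question's update describes. The variable names (sample_std, population_std, loop_sample, loop_population) are made up for this illustration, and the 1000-row frame is the one from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.arange(1000), columns=['a'])

# Expanding window: row i uses rows 0..i inclusive.
sample_std = df['a'].expanding().std()            # ddof=1, the pandas default
population_std = df['a'].expanding().std(ddof=0)  # ddof=0, as in the answer above

# Reference values from an explicit (slow) loop over growing slices.
loop_sample = pd.Series(
    [df['a'].iloc[:i + 1].std() for i in range(len(df))], index=df.index
)
loop_population = pd.Series(
    [np.std(df['a'].to_numpy()[:i + 1]) for i in range(len(df))], index=df.index
)

# Row 0 of the sample version is NaN: a single observation has no sample std.
sample_matches = np.allclose(sample_std, loop_sample, equal_nan=True)
population_matches = np.allclose(population_std, loop_population)
print(sample_matches, population_matches)
```

Note that these values differ from the 1.0, 1.6, 2.2, ... printed by the question's loop. That loop takes the square root of the sum of squares divided by i without subtracting the mean, so it is not a standard deviation in the usual sense.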

I think no loop is necessary:

df = pd.DataFrame(data=np.arange(20), columns=['a'])

df['std'] = np.round(np.sqrt(np.power(df['a'], 2).cumsum() / np.arange(len(df))), 1)
print(df)

     a   std
0    0   NaN
1    1   1.0
2    2   1.6
3    3   2.2
4    4   2.7
5    5   3.3
6    6   3.9
7    7   4.5
8    8   5.0
9    9   5.6
10  10   6.2
11  11   6.8
12  12   7.4
13  13   7.9
14  14   8.5
15  15   9.1
16  16   9.7
17  17  10.2
18  18  10.8
19  19  11.4

Answered by jezrael on December 1, 2021
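To see that the cumulative-sum formula reproduces the question's loop, and how the run times compare, here is a minimal sketch. It works on a plain NumPy array rather than the DataFrame column, sets the first divisor to NaN explicitly instead of relying on 0/0, and all variable names are illustrative. Timings depend on the machine, so none are quoted.

```python
import math
import time

import numpy as np
import pandas as pd

df = pd.DataFrame(data=np.arange(1000), columns=['a'])
values = df['a'].to_numpy(dtype=float)

# Loop version, following the question: sqrt(sum of squares of rows 0..i / i).
start = time.perf_counter()
loop_raw = [float('nan')]  # row 0 has no defined value (division by zero)
for i in range(1, len(values)):
    su = sum(values[t] ** 2 for t in range(i + 1))
    loop_raw.append(math.sqrt(su / i))
loop_seconds = time.perf_counter() - start

# Vectorized version: a running sum of squares replaces the inner loop.
start = time.perf_counter()
divisors = np.arange(len(values), dtype=float)
divisors[0] = np.nan  # make the undefined row 0 explicit
vectorized_raw = np.sqrt(np.cumsum(values ** 2) / divisors)
vectorized_seconds = time.perf_counter() - start

# Compare from row 1 on; row 0 is NaN in both versions.
same = np.allclose(loop_raw[1:], vectorized_raw[1:])
rounded = np.round(vectorized_raw, 1)
print(f'loop: {loop_seconds:.4f}s, vectorized: {vectorized_seconds:.6f}s, same: {same}')
```

The loop recomputes the whole sum of squares for every row, so its cost grows quadratically with the number of rows, while the cumulative sum visits each row once.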
