
How do I do calculations between rows in a pandas DataFrame, grouped by a certain variable?

Stack Overflow Asked on January 1, 2022

I have a dataframe like this:

Time      Name     Value
2007Q1    A        30
2007Q2    A        35
2007Q3    A        28
...
2007Q1    B        31
2007Q2    B        50
2007Q3    B        60
...
2007Q1    C        20
2007Q2    C        15
2007Q3    C        30

I want to add another column called Results, calculated between consecutive rows for each Name: the value for a quarter divided by the value for the previous quarter, minus 1, i.e. Value(Q2)/Value(Q1) - 1. The calculation should be grouped by Name, so that it only uses rows with the same name. The results should look like this:

Time      Name     Value    Results
2007Q1    A        30       
2007Q2    A        35       0.1667
2007Q3    A        28       -0.2
...
2007Q1    B        31       
2007Q2    B        50       0.6129
2007Q3    B        60       0.2
...
2007Q1    C        20
2007Q2    C        15       -0.25
2007Q3    C        30       1

The starting time period for each ‘Name’ should have no value for Results.

Thanks to everyone who can help!

One Answer

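Group by Name with DataFrame.groupby and use GroupBy.shift to shift the Value column down one row within each group, so that every row lines up with the previous quarter's value for the same Name. Then use Series.div to divide Value by the shifted values, and Series.sub to subtract 1. Because shift works on row position, this assumes the rows are already in chronological order within each Name: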

df['Results'] = df['Value'].div(df.groupby('Name')['Value'].shift()).sub(1)

Result:

print(df)
     Time Name  Value   Results
0  2007Q1    A     30       NaN
1  2007Q2    A     35  0.166667
2  2007Q3    A     28 -0.200000
3  2007Q1    B     31       NaN
4  2007Q2    B     50  0.612903
5  2007Q3    B     60  0.200000
6  2007Q1    C     20       NaN
7  2007Q2    C     15 -0.250000
8  2007Q3    C     30  1.000000
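
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample data from the question (the rows elided with "..." are left out), sorts by Name and Time as a precaution, and applies the one-liner above. The sort step and the extra column Results_pct are additions for illustration, not part of the original answer. Results_pct uses the built-in per-group pct_change, which should give the same numbers on this data:

```python
import pandas as pd

# Sample data copied from the question; the "..." rows are omitted.
df = pd.DataFrame({
    "Time": ["2007Q1", "2007Q2", "2007Q3"] * 3,
    "Name": ["A"] * 3 + ["B"] * 3 + ["C"] * 3,
    "Value": [30, 35, 28, 31, 50, 60, 20, 15, 30],
})

# Assumption: shift() works on row position, so rows must be in
# chronological order within each Name. Labels like "2007Q1" sort
# correctly as plain strings.
df = df.sort_values(["Name", "Time"]).reset_index(drop=True)

# Approach from the answer: divide each Value by the previous row's
# Value within the same Name, then subtract 1.
df["Results"] = df["Value"].div(df.groupby("Name")["Value"].shift()).sub(1)

# Illustrative alternative: per-group percentage change.
df["Results_pct"] = df.groupby("Name")["Value"].pct_change()

print(df)
```

In both columns the first row of each Name is NaN, since there is no previous quarter to compare against.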

Answered by Shubham Sharma on January 1, 2022
