
Categorize column according to lists and aggregate with result

Stack Overflow Asked on November 12, 2021

Let’s say I have a dataframe as follows:

import pandas as pd
d = {'name': ['spain', 'greece', 'belgium', 'germany', 'italy'], 'davalue': [3, 4, 6, 9, 3]}
df = pd.DataFrame(data=d)
index name  davalue
0    spain      3
1    greece     4
2    belgium    6
3    germany    9
4    italy      3

I would like to aggregate and sum based on a list of strings in the name column. So for example, I may have: southern=['spain', 'greece', 'italy'] and northern=['belgium','germany'].

My goal is to aggregate by using sum, and obtain:

index name  davalue
0   southern    10
1   northern    15

where 10=3+4+3 and 15=6+9

I imagined something like:

df.groupby(by=[['spain','greece','italy'],['belgium','germany']])

could exist. The docs say

A label or list of labels may be passed to group by the columns in self

but I’m not sure I understand what that means in terms of syntax.

3 Answers

df["regional_group"] = df.apply(lambda x: "north" if x["name"] in ['belgium', 'germany'] else "south", axis=1)

This creates a new column, which you can then group by:

df.groupby("regional_group")["davalue"].sum()
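
For reference, here is a minimal self-contained sketch of the same idea. It swaps the row-wise apply for np.where with isin, which is a substitution made here rather than the code above. The labelling rule is unchanged: anything not in the northern list is treated as "south".

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["spain", "greece", "belgium", "germany", "italy"],
    "davalue": [3, 4, 6, 9, 3],
})

northern = ["belgium", "germany"]

# Vectorised labelling: rows whose name is in `northern` become "north",
# every other row becomes "south" (same rule as the lambda).
df["regional_group"] = np.where(df["name"].isin(northern), "north", "south")

# Group by the new column and sum the values.
result = df.groupby("regional_group")["davalue"].sum()
print(result)
```

Note that with only two labels, any name missing from both lists silently ends up in "south", so this is only safe when the lists cover every row.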

Answered by Borut Flis on November 12, 2021

One way could be using np.select and using the result as a grouper:

import numpy as np

southern=['spain', 'greece', 'italy']
northern=['belgium','germany']

g = np.select([df.name.isin(southern),
               df.name.isin(northern)],
              ['southern', 'northern'],
              'others')

df.groupby(g)[['davalue']].sum()

          davalue
northern       15
southern       10
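
To illustrate what the 'others' default does, here is a small sketch with one extra row. The "japan" row is hypothetical and not part of the question's data; it is there only to show where unlisted names end up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    # "japan" is a hypothetical extra row, in neither list
    "name": ["spain", "greece", "belgium", "germany", "italy", "japan"],
    "davalue": [3, 4, 6, 9, 3, 5],
})

southern = ["spain", "greece", "italy"]
northern = ["belgium", "germany"]

# One condition per group; rows matching no condition get the default.
g = np.select(
    [df["name"].isin(southern), df["name"].isin(northern)],
    ["southern", "northern"],
    "others",
)

# `g` is a plain array aligned with the rows of df, so it can be
# passed to groupby directly as the grouping key.
result = df.groupby(g)["davalue"].sum()
print(result)
```

Because unmatched rows are collected under their own label instead of being merged into one of the regions, a typo in either list shows up in the result.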

Answered by yatu on November 12, 2021

I would build a dictionary and map:

d = {v:'southern' for v in southern}
d.update({v:'northern' for v in northern})

df['davalue'].groupby(df['name'].map(d)).sum()

Output:

name
northern    15
southern    10
Name: davalue, dtype: int64
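
A closely related sketch, assuming it is acceptable to move name into the index: groupby also accepts a dict, which it uses to translate index labels into group keys, so the explicit map call can be dropped. The name group_of is made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["spain", "greece", "belgium", "germany", "italy"],
    "davalue": [3, 4, 6, 9, 3],
})

southern = ["spain", "greece", "italy"]
northern = ["belgium", "germany"]

# Lookup table from country name to region (hypothetical helper name).
group_of = {
    **dict.fromkeys(southern, "southern"),
    **dict.fromkeys(northern, "northern"),
}

# With `name` as the index, a dict passed to groupby maps each
# index label to its group key.
result = df.set_index("name")["davalue"].groupby(group_of).sum()
print(result)
```

This also relates to the docs wording quoted in the question: a label there means a column name such as 'name', not a list of values found in that column, which is why the value lists have to be turned into a mapping first.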

Answered by Quang Hoang on November 12, 2021
