TransWikia.com

Using multiple columns while mapping a dictionary to a DataFrame

Stack Overflow Asked by dmd7 on January 3, 2022

I want to create a new column by looking up the values of several existing columns in a dictionary. Simple example below:

df:

Col1     Col2    Col3
Dog      Bird    Cat
Blue     Red     Black
Bad      Sad     Glad

my_dict = {'Bird': 'AAA','Blue':'BBB','Glad':'ZZZ'}

desired df:

Col1     Col2    Col3      NewCol
Dog      Bird    Cat       AAA
Blue     Red     Black     BBB
Bad      Sad     Glad      ZZZ

I’ve played around with the map function (df['NewCol'] = df['Col1'].map(my_dict)), but it only lets me search one column for the keys in my dictionary. I need Col1, Col2, AND Col3 all searched against the dictionary in order to create NewCol.

Any ideas? Thanks!

4 Answers

Using more plain Python in a comprehension

This is more obscure... but I think it's fun. It is likely faster in some contexts, but probably not worth the added confusion. For each row it takes the first dictionary lookup that is not null.

df.assign(NewCol=[min(map(my_dict.get, t), key=pd.isna) for t in zip(*map(df.get, df))])

   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
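
For readers who find the one-liner hard to parse, here is a rough unpacking of the same idea as an explicit loop. It is a sketch, not a drop-in equivalent: the helper name first_match is hypothetical, and it assumes the dictionary values themselves are never null.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}


def first_match(row_values, mapping):
    """Return the mapped value for the first cell that is a key in mapping.

    Hypothetical helper for illustration; returns None when no cell matches.
    """
    for value in row_values:
        if value in mapping:
            return mapping[value]
    return None


# Walk the rows as plain tuples and look each cell up in the dictionary.
new_col = [first_match(row, my_dict) for row in df.itertuples(index=False)]
result = df.assign(NewCol=new_col)
print(result)
```

Rows with no matching cell get None here, which pandas treats as a missing value.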

Answered by piRSquared on January 3, 2022

If every row contains exactly one key, another approach is to chain map, ravel, and dropna as below:

df['NewCol'] = pd.Series(df.apply(lambda x: x.map(my_dict)).values.ravel()).dropna().values

Output:

   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
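
Because the ravel/dropna approach relies on exactly one match per row, it may be worth checking that assumption first. A minimal sketch, where the names mapped, matches_per_row, and assumption_holds are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Map every column through the dictionary; cells that are not keys become NaN.
mapped = df.apply(lambda col: col.map(my_dict))

# Count how many cells in each row found a match.
matches_per_row = mapped.notna().sum(axis=1)

# The flattened-and-dropna result only lines up with the rows
# when every row has exactly one match.
assumption_holds = bool((matches_per_row == 1).all())
print(matches_per_row.tolist(), assumption_holds)
```

If a row has zero or several matches, the flattened array no longer has one value per row, and the assignment fails with a length mismatch or misaligns values.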

Answered by nimbous on January 3, 2022

Another way is to use replace on the DataFrame, keep only the cells that changed by comparing against df, and then ffill along the rows:

df['NewCol'] = df.replace(my_dict).where(lambda x: x != df).ffill(axis=1).iloc[:,-1]

Output:
   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
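
The chained expression is easier to follow when split into steps. This is a sketch of the same pipeline with illustrative intermediate names (replaced, only_mapped); it assumes no dictionary key maps to itself, since such a cell would compare equal to the original and be masked out.

```python
import pandas as pd

df = pd.DataFrame({
    "Col1": ["Dog", "Blue", "Bad"],
    "Col2": ["Bird", "Red", "Sad"],
    "Col3": ["Cat", "Black", "Glad"],
})
my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Step 1: substitute every cell that is a dictionary key with its value.
replaced = df.replace(my_dict)

# Step 2: keep only the cells that actually changed; the rest become NaN.
only_mapped = replaced.where(replaced != df)

# Step 3: push each row's values to the right and read the last column,
# which then holds the last mapped value found in that row.
new_col = only_mapped.ffill(axis=1).iloc[:, -1]

result = df.assign(NewCol=new_col)
print(result)
```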

Or use stack and droplevel:

df['NewCol'] = df.replace(my_dict).where(lambda x: x != df).stack().droplevel(1)

Answered by Andy L. on January 3, 2022

Option 1: apply map with ffill. This doesn't assume one valid entry per row.

# this takes the last valid entry in each row
# change to .bfill(axis=1).iloc[:,0] to get the first
# (axis is passed by keyword; newer pandas versions reject it as a positional argument)
df['NewCol'] = df.apply(lambda x: x.map(my_dict)).ffill(axis=1).iloc[:,-1]

Option 2: map on stack and assign. This approach assumes only one valid entry per row.

df['NewCol'] = (df.stack().map(my_dict)
                  .reset_index(level=1, drop=True)
                  .dropna()
               )

Output:

   Col1  Col2   Col3 NewCol
0   Dog  Bird    Cat    AAA
1  Blue   Red  Black    BBB
2   Bad   Sad   Glad    ZZZ
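
To make the "last versus first" remark in Option 1 concrete, here is a small sketch with a row that contains two dictionary keys. The frame df_multi is made up for illustration and is not part of the question's data.

```python
import pandas as pd

my_dict = {"Bird": "AAA", "Blue": "BBB", "Glad": "ZZZ"}

# Hypothetical row in which two cells ("Blue" and "Bird") are keys.
df_multi = pd.DataFrame({
    "Col1": ["Blue"],
    "Col2": ["Bird"],
    "Col3": ["Cat"],
})

# Map every column; the row becomes BBB, AAA, NaN.
mapped = df_multi.apply(lambda col: col.map(my_dict))

# Forward fill along the row, then read the last column: last match wins.
last_match = mapped.ffill(axis=1).iloc[:, -1]

# Backward fill along the row, then read the first column: first match wins.
first_match = mapped.bfill(axis=1).iloc[:, 0]

print(last_match.tolist(), first_match.tolist())
```

Option 2 would not work on this frame as written, because stacking yields two values for the same row index.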

Answered by Quang Hoang on January 3, 2022
