TransWikia.com

getting pivot table in pandas in the desired format

Stack Overflow Asked on December 5, 2021

I have following dataframe

Categories                     Owner
Product_Query                   products_team                   
Product_Query                   products_team
Infrastructure_Query            infra_team
Bug_Query                      development_team
Bug_Query                       development_team

I want to display the above data in the following format

Categories             Total    products_team    infra_team   development_team  
Product_Query              2        2                 0           0                                     
Infrastructure_Query       1        0                  1          0                                 
Bug_Query                  2        0                  0          2  
       

I am doing the following

df["count"]=1

df_pivot=df.pivot_table(index=["Catagories"],columns=["Owner"],values=["count"],aggfunc=[np.sum],fill_value=0)

But I am not getting the output in the correct format.How can I get the output in the format mentioned above?

One Answer

You can try using pd.crosstab:

pd.crosstab(df['Categories'], df['Owner'], margins=True, margins_name='Total')

Output:

Owner                 development_team  infra_team  products_team  Total
Categories                                                              
Bug_Query                            2           0              0      2
Infrastructure_Query                 0           1              0      1
Product_Query                        0           0              2      2
Total                                2           1              2      5

And, if you don't want the row Total you can drop it:

df_out = pd.crosstab(df['Categories'], df['Owner'], margins=True, margins_name='Total')
df_out = df_out.drop('Total')
df_out

Output:

Owner                 development_team  infra_team  products_team  Total
Categories                                                              
Bug_Query                            2           0              0      2
Infrastructure_Query                 0           1              0      1
Product_Query                        0           0              2      2

Or using pivot_table:

df.assign(counts=1)
  .pivot_table('counts','Categories','Owner','sum', 
               fill_value=0, margins=True, margins_name='Total')
  .drop('Total')

Output:

Owner                 development_team  infra_team  products_team  Total
Categories                                                              
Bug_Query                            2           0              0      2
Infrastructure_Query                 0           1              0      1
Product_Query                        0           0              2      2

Or using groupby:

df_out = df.groupby(['Categories', 'Owner'])['Owner'].count().unstack(fill_value=0)
df_out['Total'] = df_out.sum(1)

Output:

Owner                 development_team  infra_team  products_team  Total
Categories                                                              
Bug_Query                            2           0              0      2
Infrastructure_Query                 0           1              0      1
Product_Query                        0           0              2      2

Answered by Scott Boston on December 5, 2021

Add your own answers!

Ask a Question

Get help from others!

© 2024 TransWikia.com. All rights reserved. Sites we Love: PCI Database, UKBizDB, Menu Kuliner, Sharing RPP