Write pandas DataFrame to chronological Excel sheets

Stack Overflow Asked by Anna Bleha on January 3, 2021

I am trying to achieve the following, but unfortunately I'm stuck.

I have a dataframe containing:

  • a number of material groups, each containing a varying number of materials
  • their amounts in kg
  • their delivery dates (between 2015 and 2020)

What I want to do is sum up the amounts per month and create an Excel sheet that gives me the amounts per month for a given year, as in:

               Jan     Feb
Material 1     200kg   100kg
Material 2     500kg   90kg
Material 3     200kg   400kg

and so on.
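The target layout above can be sketched with a pivot table. This is only a minimal sketch: the column names `Material`, `date` and `amount` are taken from the script in this question, while the sample rows and the helper column `month` are made up for illustration.

```python
import calendar

import pandas as pd

# Hypothetical sample data; the column names follow the script in the question.
df = pd.DataFrame({
    "Material": ["Material 1", "Material 1", "Material 1",
                 "Material 2", "Material 2",
                 "Material 3", "Material 3"],
    "date": pd.to_datetime(["2015-01-05", "2015-01-20", "2015-02-14",
                            "2015-01-11", "2015-02-03",
                            "2015-01-09", "2015-02-27"]),
    "amount": [150, 50, 100, 500, 90, 200, 400],
})

# Sum the amounts per material (rows) and per month number (columns).
monthly = df.assign(month=df["date"].dt.month).pivot_table(
    index="Material",
    columns="month",
    values="amount",
    aggfunc="sum",
    fill_value=0,  # months without deliveries show 0 instead of NaN
)

# Replace the month numbers with short month names for display.
monthly.columns = [calendar.month_abbr[m] for m in monthly.columns]

print(monthly)
```

Because the columns are built from month numbers before they are renamed, they stay in calendar order rather than alphabetical order.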

I also want a separate sheet in the Excel file for each year. This is what I have managed so far with the following script (please excuse that I had to cut it down, both for readability and for confidentiality reasons, but the essential parts are in there):

import pandas as pd

df = pd.read_excel(r"C:/.../test.xlsx")

#Additional definitions 

df['materialgroup'] = df['Produkthierarchie'].str[2:6]
df2['mon'] = df2['date'].dt.strftime('%m')
df2['year'] = df2['date'].dt.strftime('%y')

relDfs = {}

for materialgroup in df3["materialgroup"].unique():                            
    df4 = df[df["materialgroup"]==materialgroup]                             
    totalWeight = df4['amount'].sum()            
    relDfs[materialgroup] = pd.DataFrame()                              
    monthlyDfs[materialgroup] = pd.DataFrame()                        

    
    for prod in df4['Material'].unique():                     
        prodDf = df4[df4['Material'] == prod]                   
        kg=prodDf['amount'].sum()                
        rel = kg/totalWeight                                  
                                     
        
        relDfs[materialgroup] = relDfs[materialgroup].append({"Material" : prod,
                                               "KG"        : kg,
                                               "rel"       : rel,
                                               }, ignore_index=True)                       
        
        #converting of stuff
        prodDf['date'] = pd.to_datetime(prodDf['date'], errors='coerce')          
        
        prodDf['month'] = prodDf['date'].apply(lambda x: f" {x.month:02}")                    
        prodDf['month-year'] = prodDf['date'].apply(lambda x: f"{x.month:02}.{x.year}")
        monthlyEntry = {}
        x = df2['year'].unique()
        x = sorted(x)
        for month in prodDf['month'].unique():
            for year in x:
                monthlyEntry[str(month)] = prodDf[prodDf['month'] == month]['amount'].sum()
        
            
            monthlyDfs[materialgroup] = monthlyDfs[materialgroup].append(monthlyEntry, ignore_index=True)
            
        for level in relDfs:
            relDf = relDfs[level]
            monthlyDf = monthlyDfs[level]
            monthlyDf2 = monthlyDf.reindex(sorted(monthlyDf.columns), axis=1)
            
            
            with pd.ExcelWriter(f"C:/.../test/{level}.xlsx") as writer:
                relDf.to_excel(writer, sheet_name='rel_Amount', index=False)
                x=str(x)
                monthlyDf2.to_excel(writer, sheet_name=x, index=True)

        

Unfortunately, only one sheet is created instead of one per year (2015 to 2020). As far as I can tell from watching the file in Explorer, it is constantly overwritten.

I assume the issue might be either with using the list x = df2['year'].unique(), or with me having to close the Excel file after every loop.
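One possible way to get one sheet per year is to open the writer once and add a sheet per year inside that single `with` block. The sketch below assumes the column names `Material`, `date` and `amount` from the script above; the function names, the sample data and the output path are hypothetical. In the default write mode, `pd.ExcelWriter` creates a fresh file each time it is opened, so a writer opened inside a loop replaces whatever an earlier iteration wrote to the same path.

```python
import pandas as pd


def monthly_tables_by_year(df):
    """Return {year: table of summed amounts}, one row per material, one column per month."""
    df = df.assign(year=df["date"].dt.year, month=df["date"].dt.month)
    tables = {}
    for year, year_df in df.groupby("year"):
        # Only the rows of this year go into the table for this year.
        tables[int(year)] = year_df.pivot_table(
            index="Material",
            columns="month",
            values="amount",
            aggfunc="sum",
            fill_value=0,
        )
    return tables


def write_year_sheets(tables, path):
    """Write every yearly table to its own sheet of one workbook."""
    # The writer is opened once, outside the loop, so all sheets end up
    # in the same file and none of them is overwritten.
    with pd.ExcelWriter(path) as writer:
        for year, table in sorted(tables.items()):
            table.to_excel(writer, sheet_name=str(year))


# Hypothetical sample data covering two years.
sample = pd.DataFrame({
    "Material": ["Material 1", "Material 1", "Material 2",
                 "Material 1", "Material 2"],
    "date": pd.to_datetime(["2015-01-05", "2015-01-20", "2015-02-03",
                            "2016-01-08", "2016-03-15"]),
    "amount": [150, 50, 90, 300, 70],
})

tables = monthly_tables_by_year(sample)

# Hypothetical output path; writing needs an Excel engine such as openpyxl.
# write_year_sheets(tables, "monthly_amounts.xlsx")
```

With the real data, the same two calls could be made once per material group, using a different output path for each group.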

I checked a few other questions on this topic (e.g. "How to save a new sheet in an existing excel file, using Pandas?"), but none of them fully solved my problem.

Any help towards that issue is highly appreciated.
BR
