Search code examples
pythonpandasexceldataframe

Creating a custom desribe summary table for a number of columns


I have this dataset:

d = {'eue_scanner': [1,2,3,4,5],
     'eue_thinclient': [12,1221,2123,1231, 123],
     'service_model': ['Gold', 'Bronze', 'Bronze', 'Silver', 'Silver']}

df = pd.DataFrame(data=d)

I want to create a describe table that will look like this (ignore actual numbers):

enter image description here

df['eue_thinclient'].describe().to_csv("my_description.csv")

but I would want all these summary tables in this formatting for the rest. Here is the code I have:

for smf, df_smf in df_final.groupby('service_model'):
    for col in columns_for_stats:
        print(smf,"-", col)
        print(df_smf[col].describe())
        df_smf[col].describe().to_csv("my_description.csv")
        break
    break

The column_for_stats list is just this:

column_for_stats = ['eue_scanner', 'eue_thinclient']

but for all the different combinations. Here is the expected csv output in picture form

enter image description here

Any help is appreciated.


Solution

  • You're looking for an Excel spreadsheet and not a csv :

    def fn(ser):
        return (
            ser.describe().to_frame().reset_index()
            .style.set_properties(**{"border": "1px solid"})
        )
    
    from itertools import product
    combs = product(df.groupby("service_model"), columns_for_stats)
    
    N = 3 # maximum number of tables per bloc
    SR, SC = 1, 2 # starts of row/col (Excel's index)
    HS, VS = 1, 2 # number of separators (empty row/col)
    
    with pd.ExcelWriter("output.xlsx", engine="xlsxwriter") as writer:
        for idx, ((sm, g), col) in enumerate(combs):
            sr, sc = (idx // N) * (8 + VS+1) + (SR-1), (idx % N) * (2 + HS) + (SC-1)
            fn(g[col]).to_excel(writer, startrow=sr, startcol=sc, index=False)
            fmt = writer.book.add_format({"align": "center", "bold": True, "border": 1})
            writer.sheets["Sheet1"].merge_range(sr, sc, sr, sc+1, f"{sm} - {col}", fmt)
    

    Output :

    enter image description here