Search code examples
excelpandaspython-3.5transpose

Transposing Data in Pandas


I have an excel file that contains a count of how many times an Part has been used during its lifespan. The data is currently stored in such a way that the Serial numbers are in Column A, and each "Lifespan" Count is stored in adjacent columns, with a "Date" value as its heading.
Here is an example:

Image1

I want to be able to pivot/transpose ALL of the Date columns in Python so that the output is in the following format with the Lifespan Count as a new column named "Count":

Image2

I've tried using the pivot function in pandas to flip the appropriate rows & columns around, however the problem is that I'd have to define each column one at a time. The nature of the spreadsheet used in this example is just one of several spreadsheets I use that all start at a different Date (some start at 01/01/2010 to present while others start at 01/01/1999 to present). It is worth noting that all spreadsheets start at the first of every month and increment 1 month at a time.

Therefore is there a way I can just select ALL the columns I've loaded into a Dataframe via the spreadsheet loaded to be able to Pivot/Transpose the data as mentioned?

pivot = df.pivot_table(index=['Serial'], values=['01/01/2019','01/02/2019',... etc], aggfunc='max'])

Solution

  • Simple use of stack() if you have named your column series and have set_index() on your rows.

    import datetime as dt
    import random
    df = pd.DataFrame({**{"Serial":[s for s in range(1111,1121)]}, 
     **{d:[random.randint(0,150) for i in range(10)] for d in pd.date_range(start=dt.date(2019,1,1), end=dt.date(2019,1,6))}
    }).set_index("Serial")
    df.columns.set_names("Date", inplace=True)
    dfs = df.stack().to_frame().rename(columns={0:"Count"})
    
    print(f"{df.iloc[:,:3].to_string()}\n\n{dfs.iloc[:10].to_string()}")
    

    sample output

    Date    2019-01-01 00:00:00  2019-01-02 00:00:00  2019-01-03 00:00:00
    Serial                                                               
    1111                    134                   76                    8
    1112                    115                   37                    8
    1113                     16                   23                  130
    1114                    108                   45                   69
    1115                      0                   51                  111
    1116                     51                   71                  101
    1117                    107                   10                  142
    1118                     80                   48                  134
    1119                    148                   62                   75
    1120                      4                  135                  100
    
                                Count
    Serial Date                      
    1111   2019-01-01 00:00:00    134
           2019-01-02 00:00:00     76
           2019-01-03 00:00:00      8
           2019-01-04 00:00:00     16
           2019-01-05 00:00:00    140
           2019-01-06 00:00:00     28
    1112   2019-01-01 00:00:00    115
           2019-01-02 00:00:00     37
           2019-01-03 00:00:00      8
           2019-01-04 00:00:00     56