Search code examples
pythonpandasreshapedata-transform

Advance grouping and reshaping transformation using pandas


I have a pandas dataframe like this.

Territory_id    client_id   patient_id  Total Clinic    Clinic Number   Attribute.2         Value
    43              172         6021        1               Clinic 1    Service Date      06/22/2017 
    43              172         6021        1               Clinic 1    Product              A
    43              172         6021        1               Clinic 1    Qty                  1
    43              172         6021        1               Clinic 1    Amount              80.63
    43              172         6021        1               Clinic 1    Age                11 y 4 m 
    43              172         6021        1               Clinic 1    Days Last Clinic    0
    43              172         6137        1               Clinic 1    Service Date      06/22/2017 
    43              172         6137        1               Clinic 1    Product              B
    43              172         6137        1               Clinic 1    Qty                  1
    43              172         6137        1               Clinic 1    Amount              80.63
    43              172         6137        1               Clinic 1    Age                  7 y  
    43              172         6137        1               Clinic 1    Days Last Clinic     0
    43              187         5658        5               Clinic 1    Service Date      06/07/2017 
    43              187         5658        5               Clinic 1    Product               C
    43              187         5658        5               Clinic 1    Qty                   1
    43              187         5658        5               Clinic 1    Amount                0
    43              187         5658        5               Clinic 1    Age                9 y 1 m 
    43              187         5658        5               Clinic 1    Days Last Clinic     0
    43              187         5658        5               Clinic 2    Service Date        06/30/2017 
    43              187         5658        5               Clinic 2    Product               D
    43              187         5658        5               Clinic 2    Qty                   2
    43              187         5658        5               Clinic 2    Amount              52.48
    43              187         5658        5               Clinic 2    Age                 9 y 1 m 
    43              187         5658        5               Clinic 2    Days Last Clinic    23
    43              187         5658        5               Clinic 3    Service Date       09/12/2017 
    43              187         5658        5               Clinic 3    Product               E
    43              187         5658        5               Clinic 3    Qty                   3
    43              187         5658        5               Clinic 3    Amount              78.72
    43              187         5658        5               Clinic 3    Age                9 y 4 m 
    43              187         5658        5               Clinic 3    Days Last Clinic    74
    43              187         5658        5               Clinic 4    Service Date       09/05/2018 
    43              187         5658        5               Clinic 4    Product               F
    43              187         5658        5               Clinic 4    Qty                   2
    43              187         5658        5               Clinic 4    Amount              53.72
    43              187         5658        5               Clinic 4    Age                 10 y 4 m 
    43              187         5658        5               Clinic 4    Days Last Clinic     358

I want the results to be reshaped by Attribute.2 column names. There are 6 attributes Service Datea, Product, Qty, Amount, Age, Day Last Clinic. I want these columns should be moved to 1 row and other column values should be picked up with the last one or first one both are the same.

enter image description here

I tried this using pivot table but its only returns the first records with this.

df.pivot_table(index=['Territory_id','client_id','patient_id','Total Clinic','Clinic Number'],
              columns='Attribute.2',
              values='Value',
              aggfunc='first'
              )

Need help.

Thanks


Solution

  • Try reset_index at the end:

    df.pivot_table(index=['Territory_id','client_id','patient_id','Total Clinic','Clinic Number'],
                  columns='Attribute.2',
                  values='Value',
                  aggfunc='first'
                  ).reset_index()
    

    Output:

    Attribute.2  Territory_id  client_id  patient_id  Total Clinic Clinic Number       Age Amount Days Last Clinic Product Qty Service Date
    0                      43        172        6021             1      Clinic 1  11 y 4 m  80.63                0       A   1   06/22/2017
    1                      43        172        6137             1      Clinic 1       7 y  80.63                0       B   1   06/22/2017
    2                      43        187        5658             5      Clinic 1   9 y 1 m      0                0       C   1   06/07/2017
    3                      43        187        5658             5      Clinic 2   9 y 1 m  52.48               23       D   2   06/30/2017
    4                      43        187        5658             5      Clinic 3   9 y 4 m  78.72               74       E   3   09/12/2017
    5                      43        187        5658             5      Clinic 4  10 y 4 m  53.72              358       F   2   09/05/2018