Search code examples
f#deedle

Rearrange rows and columns in Deedle


I have a table like

 Month Cluster Year ActualAmount TargetedAmount 
 1     1       2015 100          200            
 1     1       2016 300          400            
 1     1       2017 300          400            
 2     1       2015 500          600            
 2     2       2016 700          800  

and I would like to have the row values of year as columns like

 Month Cluster ActualAmount.2015 ActualAmount.2016 ActualAmount.2017 TargetedAmount.2015 ...
 1     1        100          300          300          200 ...             
 2     1        500            -            -          600 ...
...

I have tried to solve it with pivotTable (see below). It does not get the index right.

#r "nuget: Deedle"

open System
open Deedle

type Record =
    { Month: int
      Cluster: int
      Year: int
      ActualAmount: int
      TargetedAmount: int }

let Records =
    [ { Month = 1
        Cluster = 1
        Year = 2015
        ActualAmount = 100
        TargetedAmount = 200 }
      { Month = 1
        Cluster = 1
        Year = 2016
        ActualAmount = 300
        TargetedAmount = 400 }
      { Month = 1
        Cluster = 1
        Year = 2017
        ActualAmount = 300
        TargetedAmount = 400 }
      { Month = 2
        Cluster = 1
        Year = 2015
        ActualAmount = 500
        TargetedAmount = 600 }
      { Month = 2
        Cluster = 2
        Year = 2016
        ActualAmount = 700
        TargetedAmount = 800 } ]

let df = Frame.ofRecords Records

df.Print()

let pdf = df |> Frame.pivotTable (fun k r -> r.GetAs<int>("Month")) (fun k r -> r.GetAs<int>("Year")) id
    
pdf.Print()
     2015                                       2016                                       2017                                       
1 -> Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] 
2 -> Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] <missing>                                      

Any help is appreciated.


Solution

  • I'm not a Deedle expert, but this seems to work:

    let pivot col df =
        df
            |> Frame.pivotTable
                (fun k r -> r.GetAs<int>("Month"), r.GetAs<int>("Cluster"))
                (fun k r -> r.GetAs<int>("Year"))
                (fun frm -> frm.GetColumn(col).Sum().ToString())
            |> Frame.fillMissingWith "-"
            |> Frame.mapColKeys (fun c -> $"{col}.{c}")
    let actualAmounts =
        df |> pivot "ActualAmount"
    let targetedAmounts =
        df |> pivot "TargetedAmount"
    
    let monthClusters =
        actualAmounts
            |> Frame.mapRows (fun (month, cluster) _ ->
                [
                    "Month", month
                    "Cluster", cluster
                ] |> Series.ofObservations)
            |> Frame.ofRows
    
    let pdf = monthClusters.Join(actualAmounts).Join(targetedAmounts)
    pdf.Print()
    

    Output is:

           Month Cluster ActualAmount.2015 ActualAmount.2016 ActualAmount.2017 TargetedAmount.2015 TargetedAmount.2016 TargetedAmount.2017
    1 1 -> 1     1       100               300               300               200                 400                 400
    2 1 -> 2     1       500               -                 -                 600                 -                   -
      2 -> 2     2       -                 700               -                 -                   800                 -
    

    The trick is to compute separate pivot tables for actual and targeted amounts, and then join them together.