I have a table like
Month Cluster Year ActualAmount TargetedAmount
1 1 2015 100 200
1 1 2016 300 400
1 1 2017 300 400
2 1 2015 500 600
2 2 2016 700 800
and I would like to have the row values of year as columns like
Month Cluster ActualAmount.2015 ActualAmount.2016 ActualAmount.2017 TargetedAmount.2015 ...
1 1 100 300 300 200 ...
2 1 500 - - 600 ...
...
I have tried to solve it with pivotTable
(see below). It does not get the index right.
#r "nuget: Deedle"
open System
open Deedle
type Record =
{ Month: int
Cluster: int
Year: int
ActualAmount: int
TargetedAmount: int }
let Records =
[ { Month = 1
Cluster = 1
Year = 2015
ActualAmount = 100
TargetedAmount = 200 }
{ Month = 1
Cluster = 1
Year = 2016
ActualAmount = 300
TargetedAmount = 400 }
{ Month = 1
Cluster = 1
Year = 2017
ActualAmount = 300
TargetedAmount = 400 }
{ Month = 2
Cluster = 1
Year = 2015
ActualAmount = 500
TargetedAmount = 600 }
{ Month = 2
Cluster = 2
Year = 2016
ActualAmount = 700
TargetedAmount = 800 } ]
let df = Frame.ofRecords Records
df.Print()
let pdf = df |> Frame.pivotTable (fun k r -> r.GetAs<int>("Month")) (fun k r -> r.GetAs<int>("Year")) id
pdf.Print()
2015 2016 2017
1 -> Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String]
2 -> Deedle.Frame`2[System.Int32,System.String] Deedle.Frame`2[System.Int32,System.String] <missing>
Any help is appreciated.
I'm not a Deedle expert, but this seems to work:
let pivot col df =
df
|> Frame.pivotTable
(fun k r -> r.GetAs<int>("Month"), r.GetAs<int>("Cluster"))
(fun k r -> r.GetAs<int>("Year"))
(fun frm -> frm.GetColumn(col).Sum().ToString())
|> Frame.fillMissingWith "-"
|> Frame.mapColKeys (fun c -> $"{col}.{c}")
let actualAmounts =
df |> pivot "ActualAmount"
let targetedAmounts =
df |> pivot "TargetedAmount"
let monthClusters =
actualAmounts
|> Frame.mapRows (fun (month, cluster) _ ->
[
"Month", month
"Cluster", cluster
] |> Series.ofObservations)
|> Frame.ofRows
let pdf = monthClusters.Join(actualAmounts).Join(targetedAmounts)
pdf.Print()
Output is:
Month Cluster ActualAmount.2015 ActualAmount.2016 ActualAmount.2017 TargetedAmount.2015 TargetedAmount.2016 TargetedAmount.2017
1 1 -> 1 1 100 300 300 200 400 400
2 1 -> 2 1 500 - - 600 - -
2 -> 2 2 - 700 - - 800 -
The trick is to compute separate pivot tables for actual and targeted amounts, and then join them together.