I have recently started to learn F# for Data Science (coming from simple C# and Python). I start to get used to the power of functional first paradigm for Science.
However, I am still confused on how to treat a problem I could easily fix using pandas in Python. It is related to Multi index time series / Data frame. I have extensively checked on Deedle but I am still not sure if Deedle could help me achieve such a table:
Column Index 1: A || B
Column Index 2: A1 A2 || B1 B2
Column Index 3: p1 p2 | p1 p2 || p1 p2 | p1 p2
Row Index:
date1 0.5 2. | 2. 0.5 || 3. 0. | 2. 3.
date2 ......
The idea being able to sum all p1 series when Index1 = A etc etc
I did not find example of such a thing using Deedle.
If it is not available, what structure for my data would you recommend me?
Thanks for helping a newbie (but in love with) in F#
In Deedle, you can create a frame or a series with hierarchical index by using a tuple as the key:
let ts =
series
[ ("A", "A1", "p1") => 0.5
("A", "A1", "p2") => 2.
("A", "A2", "p3") => 2.
("A", "A2", "p4") => 0.5 ]
Deedle does have some special handling for this. For example, it will output the data as:
A A1 p1 -> 0.5
p2 -> 2
A2 p3 -> 2
p4 -> 0.5
To apply aggregation over a part of the hierarchy, you can use the applyLevel
function:
ts |> Series.applyLevel (fun (l1, l2, l3) -> l1) Stats.mean
ts |> Series.applyLevel (fun (l1, l2, l3) -> l1, l2) Stats.mean
The first argument is a function that gets the tuple of keys and selects what part of the level you want to group - so the above two create an aggregation over the top and top two levels, respectively.