I am learning F#.
I am trying to convert a Map<string, seq<DateTime * float>>
to a Deedle dataframe (http://bluemountaincapital.github.io/Deedle/tutorial.html#creating).
I have prepared the following code:
let folderFnct (aFrame:Frame) colName datesAndValues =
    let newSerie = Series(Seq.map (fun x -> fst x) datesAndValues, Seq.map (fun y -> snd y) datesAndValues)
    let newFrame = aFrame.Join([colName], [newSerie], kind=JoinKind.Inner)
    newFrame
let mapToDeedleFrame myMap frame =
    Map.fold (fun s ticker datesAndValues -> folderFnct s ticker datesAndValues) frame myMap
mapToDeedleFrame folds the map using an existing frame. The folder function folderFnct takes each value of the map (a seq<DateTime * float>), makes a Series from it, and joins that Series to the frame.
The problem is with:
let newFrame = aFrame.Join([colName], [newSerie], kind=JoinKind.Inner)
where:
The field, constructor or member 'Join' is not defined
I have identified three potential causes of the issue:
1. Why is aFrame.Join not defined? I tried explicitly specifying the type of aFrame.
2. Should mapToDeedleFrame be passed an empty frame?
3. Should folderFnct guard against the case where aFrame is empty?
Thanks a lot!
EDIT 1
Based on Tomas's suggestion, this is what I have cranked out so far.
let folderFnct (aFrame:Frame<'a, 'b>) columnName (seqOfTuples: seq<'a*'b>) =
    let newSerie = Series(Seq.map (fun x -> fst x) seqOfTuples, Seq.map (fun y -> snd y) seqOfTuples)
    let otherFrame = Frame([columnName], [newSerie])
    let newFrame = aFrame.Join((otherFrame), kind=JoinKind.Inner)
    newFrame
let mapToDeedleFrame myMap frame =
    Map.fold (fun state k vals -> folderFnct state k vals) frame myMap
The last missing step is: how do I quickly pass an empty Frame (ideally without creating a dummy one) to mapToDeedleFrame? I have tried [], as in:
let frame = mapToDeedleFrame mapTS []
This may be a silly question, but I am new to F# and I was wondering if there is an Empty type built into the language.
FOLLOW UP QUESTION
In the source file I read (https://github.com/BlueMountainCapital/Deedle/blob/master/src/Deedle/Frame.fs):
member frame.Join<'V>(colKey, series:Series<'TRowKey, 'V>, kind, lookup) =
    let otherFrame = Frame([colKey], [series])
    frame.Join(otherFrame, kind, lookup)
while in the function description that pops up on the screen:
From the picture above I would guess that the type of the Frame is the same as the type of colKey, whereas, as I understand it, colKey is just the key of the dataframe column added by joining the series. As a complete noob, I am quite confused.
EDIT 2
I have rewritten the code:
let seriesListMapper (colName:string, series:Series<'a, 'b>) =
    [colName => series] |> frame
let frameListReducer (accFrame: Frame<'a, 'b>) (aFrame: Frame<'a, 'b>) =
    accFrame.Join(aFrame, kind=JoinKind.Outer)
let seriesListToFrame (seriesList: List<string * Series<'a, 'b>>) =
    seriesList
    |> List.map (fun elem -> seriesListMapper elem)
    |> List.reduce (fun acc elem -> frameListReducer acc elem)
The problem is that:
let frame = seriesListToFrame seriesList
returns frame as Frame<DateTime, string>, while seriesList is instead a (string * Series<DateTime, float>) list.
I think that the problem is with:
let seriesListMapper (colName:string, series:Series<'a, 'b>) =
    [colName => series] |> frame
In fact, seriesListMapper is reported as having the type:
seriesListMapper: colName:string * series:Series<'a, 'b> -> Frame<'a, string>
I do not understand how and why the values are converted from float to string.
One interesting thing is that printing the frame with frame.Format() actually confirms that the data looks correct. It is just this "strange" conversion to string.
In the type annotation of folderFnct, you have aFrame:Frame. However, the type representing data frames is a generic type with two type arguments (representing the types of the row index and the column index, respectively), so the annotation should be aFrame:Frame<_, _>.
Another way to add a series to a frame is to use the mutating operation:
aFrame.AddSeries(colName, newSeries)
Note that this only supports a left join (a data frame can be mutated only by adding new series, not in a way that would change the row index). Still, you might be able to use this approach and then drop all missing values from the frame once it is constructed.
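The "construct first, then drop missing values" idea can be sketched as follows. This is only a sketch under assumptions: instead of calling AddSeries repeatedly, it builds all columns at once with Deedle's `frame` and `series` helpers (which, as far as I know, align columns on the union of row keys) and then applies `Frame.dropSparseRows`, so no empty seed frame is needed. The name `mapToFrame` and the sample tickers are hypothetical and not part of the question.

```fsharp
open System
open Deedle

// Hypothetical helper: one column per map entry, keyed by ticker.
let mapToFrame (data: Map<string, seq<DateTime * float>>) : Frame<DateTime, string> =
    data
    |> Map.toSeq
    // Turn each sequence of (date, value) pairs into a Series<DateTime, float>.
    |> Seq.map (fun (ticker, obs) -> ticker => series obs)
    // Build the frame from all columns at once.
    |> frame
    // Drop rows that have a missing value in any column,
    // which approximates the effect of an inner join.
    |> Frame.dropSparseRows

// Illustrative data only.
let sample =
    Map.ofList
        [ "AAA", seq [ DateTime(2014, 1, 1), 1.0; DateTime(2014, 1, 2), 2.0 ]
          "BBB", seq [ DateTime(2014, 1, 2), 3.0 ] ]

let df = mapToFrame sample
```

Because only dates present in every series survive the final step, the result should be comparable to folding with JoinKind.Inner, without having to decide what the initial frame is.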
EDIT: To answer the question about generic types:
Series<K, V> represents a series with keys of type K containing values of type V (e.g. a series with ordinarily indexed observations would have K=int and V=float).
Frame<R, C> represents a frame that contains heterogeneous data (of potentially varying types for each column) where the rows are indexed by R and the columns are indexed by C. For an ordinarily indexed frame R=int and, typically, your columns will be named, so C=string (but you can have other indices too).
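To make the type arguments concrete, here is a small sketch assuming Deedle's `series` and `frame` helpers and the `=>` operator; the column name "ACME" and the values are made up for illustration. It also suggests why a frame built from Series<DateTime, float> shows up as Frame<DateTime, string>: the second type argument is the type of the column keys, not the type of the values.

```fsharp
open System
open Deedle

// Series<K, V>: keys are DateTime, values are float.
let prices : Series<DateTime, float> =
    series [ DateTime(2014, 1, 2) => 100.0
             DateTime(2014, 1, 3) => 101.5 ]

// Frame<R, C>: rows are indexed by DateTime, columns by string.
// The float value type is not part of the frame's type,
// because each column may hold values of a different type.
let df : Frame<DateTime, string> = frame [ "ACME" => prices ]

// The annotation suggested above, leaving both type arguments to inference.
let columnCount (aFrame: Frame<_, _>) = aFrame.ColumnCount
```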