When I run the code below I get a DataFrame with one bool column and two double columns. However, when I extract the bool column as a Series, the result is a Series with key type DateTime and value type float.
It looks like Deedle "casts" the column to another type.
Why is this happening?
open System   // needed for DateTime
open Deedle

let dates =
    [ DateTime(2013,1,1);
      DateTime(2013,1,4);
      DateTime(2013,1,8) ]
let values = [ 10.0; 20.0; 30.0 ]
let values2 = [ 0.0; -1.0; 1.0 ]
let first = Series(dates, values)
let second = Series(dates, values2)
// Boolean column: true where the value in "second" is positive
let third: Series<DateTime,bool> = Series.map (fun k v -> v > 0.0) second
let df1 = Frame(["first"; "second"; "third"], [first; second; third])
// The indexer returns the column typed as Series<DateTime,float>
let sb = df1.["third"]
df1;;
val it : Frame<DateTime,string> =
Deedle.Frame`2[System.DateTime,System.String]
{ColumnCount = 3;
ColumnIndex = Deedle.Indices.Linear.LinearIndex`1[System.String];
ColumnKeys = seq ["first"; "second"; "third"];
ColumnTypes = seq [System.Double; System.Double; System.Boolean];
...
sb;;
val it : Series<DateTime,float> = ...
As the existing answer points out, GetColumn is the way to go. You can specify the generic parameter directly when calling GetColumn and avoid the type annotation to make the code nicer:
let sb = df1.GetColumn<bool>("third")
A Deedle frame does not statically keep track of the types of its columns, so when you want to get a column as a typed series, you need to specify the type in some way.
We did not want to force people to write type annotations, because they tend to be quite long and ugly, so the primary way of getting a column is GetColumn, where you can specify the type argument as in the above example.
The other ways of accessing a column, such as df?third and df.["third"], are shorthands that assume the column type to be float, because that happens to be quite a common scenario (at least for the most common uses of Deedle in finance). These two notations give you a simpler way that "often works nicely".
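To put the access styles described above side by side, here is a minimal F# script sketch. It rebuilds the frame from the question; the explicit upcasts to ISeries<DateTime> and the names sbGeneric, sbAnnotated, f1 and f2 are additions for illustration and are not part of the original code. The shorthands are shown only on a float column, since that is the case they are meant for.

```fsharp
#r "nuget: Deedle"
open System
open Deedle

// Same data as in the question.
let dates = [ DateTime(2013,1,1); DateTime(2013,1,4); DateTime(2013,1,8) ]
let first = Series(dates, [ 10.0; 20.0; 30.0 ])
let second = Series(dates, [ 0.0; -1.0; 1.0 ])
let third : Series<DateTime,bool> = Series.map (fun k v -> v > 0.0) second

// Assumption: the upcasts give the list a single element type;
// the question passes the series without them.
let df1 =
    Frame(["first"; "second"; "third"],
          [ first :> ISeries<DateTime>
            second :> ISeries<DateTime>
            third :> ISeries<DateTime> ])

// Explicit type argument, as recommended in the answer.
let sbGeneric = df1.GetColumn<bool>("third")

// Same call, with the type supplied by an annotation instead.
let sbAnnotated : Series<DateTime,bool> = df1.GetColumn("third")

// Shorthands: both are statically typed as Series<DateTime,float>.
let f1 : Series<DateTime,float> = df1?first
let f2 : Series<DateTime,float> = df1.["first"]
```

The annotations on f1 and f2 are redundant and only make the inferred types visible; for any column that is not float, GetColumn with an explicit type argument is the form that matches the column's actual type.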