
F#, Deedle and OptionalValue: Object must implement IConvertible error


I run into trouble when I create missing values in a Frame and later perform operations on them. Here is a sample that reproduces the problem:

open Deedle
open System.Text.RegularExpressions

do fsi.AddPrinter(fun (printer:Deedle.Internal.IFsiFormattable) -> "\n" + (printer.Format()))

module Frame = let mapAddCol col f frame = frame |> Frame.addCol col (Frame.mapRowValues f frame)

[   {|Desc = "A - 1.50ml"; ``Price ($)`` = 23.|}
    {|Desc = "B - 2ml"; ``Price ($)`` = 18.5|}
    {|Desc = "C"; ``Price ($)`` = 25.|}             ]
|> Frame.ofRecords
(*
     Desc       Price ($) 
0 -> A - 1.50ml 23        
1 -> B - 2ml    18.5      
2 -> C          25        
*)
|> Frame.mapAddCol "Volume (ml)" (fun row ->
    match Regex.Match(row.GetAs<string>("Desc"),"[\d\.]+").Value with
    | "" -> OptionalValue.Missing
    | n -> n |> float |> OptionalValue)
(* 
     Desc       Price ($) Volume (ml) 
0 -> A - 1.50ml 23        1.5         
1 -> B - 2ml    18.5      2           
2 -> C          25        <missing>   
*)
|> fun df -> df?``Price ($/ml)`` <- df?``Price ($)`` / df?``Volume (ml)``
//error message: System.InvalidCastException: Object must implement IConvertible.

What is wrong with this approach?


Solution

  • Deedle internally stores a flag indicating whether each value is present or missing. This is typically exposed through the OptionalValue type, but the internal representation does not actually use this type.

    When you use a function such as mapRowValues to generate new data, Deedle needs to recognize which values are missing, and it does so only in a limited number of cases. When you return OptionalValue<float>, Deedle actually produces a series whose values have type OptionalValue<float> rather than float (the type system does not let it do anything else). The ? operator then tries to read that column as float, and the conversion fails because OptionalValue<float> does not implement IConvertible.

    For float values, the solution is just to return nan as your missing value:

    |> Frame.mapAddCol "Volume (ml)" (fun row ->
        match Regex.Match(row.GetAs<string>("Desc"),"[\d\.]+").Value with
        | "" -> nan
        | n -> n |> float )
    

    This will create a new series of float values, which you can then access using the ? operator.
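
For reference, here is a minimal end-to-end sketch of the nan approach as an F# script. It assumes Deedle is loaded from NuGet; parseVolume is a hypothetical helper (not part of Deedle), and the column names Price, Volume and PricePerMl are simplified stand-ins for the ones in the question.

```fsharp
#r "nuget: Deedle"
open System
open System.Globalization
open System.Text.RegularExpressions
open Deedle

// Hypothetical helper: the first number found in the description,
// or nan when the description contains no number.
let parseVolume (desc: string) : float =
    let m = Regex.Match(desc, @"\d+(\.\d+)?")
    if m.Success then Double.Parse(m.Value, CultureInfo.InvariantCulture) else nan

let df =
    [ {| Desc = "A - 1.50ml"; Price = 23.0 |}
      {| Desc = "B - 2ml"; Price = 18.5 |}
      {| Desc = "C"; Price = 25.0 |} ]
    |> Frame.ofRecords

// The mapping function returns a plain float, so the new column is a float
// series; rows where it returns nan are treated as missing.
let volume =
    df |> Frame.mapRowValues (fun row -> parseVolume (row.GetAs<string>("Desc")))
df?Volume <- volume

// The ? operator can now read both columns as float series.
df?PricePerMl <- df?Price / df?Volume

// Row 2 ("C") has no volume, so its price per ml should be missing as well.
let perMl = df?PricePerMl
printfn "%A" (perMl.TryGet 2)
```

Keeping the parsing in a separate function makes the missing-value rule easy to check on its own, independently of the frame.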