c# dataframe deedle

C# Deedle Dataframe ReplaceColumn <Missing> values


I'm having a difficult time trying to figure this one out. First off, I'm completely new to Deedle, so forgive me if I ask a dumb question here.

I have a parent Frame that I'm filtering, which results in a sub frame:

var subFrame = parentFrame.Where(kvp => kvp.Value.GetAs<string>("ColA") == "ValueA" && kvp.Value.GetAs<string>("ColB") == "ValueB");

var subCol = subFrame.GetColumn<decimal>("ColDecimal1");
subCol = subFrame.GetColumn<decimal>("ColDecimal2");

parentFrame.ReplaceColumn("ColDecimal1", subCol);

When I do this, it almost gives me my desired result. The column values do align with the row keys of the parent frame; however, the rows that were not matched by the subFrame filter end up with a "ColDecimal1" value of <Missing>. I know this is the behavior described in the Deedle documentation, but I'm trying to find a workaround. I would prefer examples in C# rather than F#; F# is a bit foreign to me, so it's been difficult to follow.

Anyway, thank you in advance.


Solution

  • The subFrame only has data for some of the rows (as specified by your filter). ReplaceColumn will replace a column, adding <Missing> for all row keys for which the new column does not have values, so that is why you're seeing what you are seeing.

    If you would like to get a result with the new values for rows that were selected by the filter, but the old values for all other rows, you can do this using Zip and SelectValues.

    For example, say I have two series:

    let parentCol = series [1 => 1.0; 2 => 2.0; 3 => 3.0]
    let subCol = series [1 => 1.1 ]
    

    The subCol has data just for key 1. Now, I want to replace the value 1.0 in parentCol with the value from subCol. To do this in F#, you'd write (assuming there are no missing values in parentCol):

    parentCol.Zip(subCol, JoinKind.Left).SelectValues(fun (v1, v2) ->
      if v2.HasValue then v2.Value else v1.Value
    )
    

    The only differences in C# are that you access the tuple elements using Item1 and Item2 and that the lambda syntax is different. I did not test this, but it should be:

    parentCol.Zip(subCol, JoinKind.Left).SelectValues(v =>
      v.Item2.HasValue ? v.Item2.Value : v.Item1.Value);
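
    To make the C# version easier to try out, here is a minimal sketch that rebuilds the two example series from above and applies the same Zip and SelectValues combination. It assumes the Deedle NuGet package is referenced and that SeriesBuilder is used to construct the series (that construction is my choice for illustration, not something from the question). The name merged is hypothetical, and like the snippet above, this assumes parentCol has no missing values.

    ```csharp
    using System;
    using Deedle;

    // Rebuild the two example series from the F# snippet above.
    var parentBuilder = new SeriesBuilder<int, double>();
    parentBuilder.Add(1, 1.0);
    parentBuilder.Add(2, 2.0);
    parentBuilder.Add(3, 3.0);
    Series<int, double> parentCol = parentBuilder.Series;

    var subBuilder = new SeriesBuilder<int, double>();
    subBuilder.Add(1, 1.1);
    Series<int, double> subCol = subBuilder.Series;

    // A left join keeps every key of parentCol. Item2 has no value
    // for keys that are absent from subCol, so fall back to Item1 there.
    // Assumption: parentCol itself has no missing values, otherwise
    // reading v.Item1.Value would fail.
    Series<int, double> merged = parentCol
        .Zip(subCol, JoinKind.Left)
        .SelectValues(v => v.Item2.HasValue ? v.Item2.Value : v.Item1.Value);

    // Inspect the result for each key.
    foreach (var key in merged.Keys)
    {
        Console.WriteLine($"{key} => {merged.Get(key)}");
    }
    ```

    In the question's setting, parentCol would presumably come from parentFrame.GetColumn<decimal>("ColDecimal1"), and the merged series would then be passed to parentFrame.ReplaceColumn("ColDecimal1", merged) in place of subCol, so that rows outside the filter keep their old values instead of becoming <Missing>.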