How to flatten an object with nested object and nested collection into Deedle dataframe?


I have the following class definitions:

public class SomeObject {

  public string Id { get; set; }

  public string Name { get; set; }

  public SomeOtherObject SomeOtherObject { get; set; }

  public SomeAnotherObject[] SomeAnotherObjectArr { get; set; }

}

public class SomeOtherObject {

  public string OtherObjectName { get; set; }

  //other properties omitted for brevity 

}

public class SomeAnotherObject {
  
  public string AnotherObjectName { get; set; }

  public bool Flag { get; set; }

}

I am reading a JSON file which deserialises to SomeObject. The goal is a data frame that looks like this:

Id   Name   OtherObjectName   AnotherObjectName   Flag
1    Name1  OtherObjectName1  AnotherObjectName1  false
1    Name1  OtherObjectName1  AnotherObjectName2  true

The code which I tried is:

SomeObject someObject = GetDeserialisedJson();
var df = Frame.FromRecords(new [] { someObject });
df.Print();

The output it prints is:

Id   Name    SomeOtherObject                                  SomeAnotherObjectArr
1    Name1   SomeOtherObject { OtherObjectName = someValue }  Model.SomeAnotherObject[]

Basically, the nested object is not flattened automatically, and for the nested array it just prints namespace.classname[].

As long as the object has a simple structure with only primitive properties, everything works. How can I achieve the required data frame structure in my case? I am an absolute beginner to this paradigm, so any alternative approaches or suggestions are welcome.


Solution

  • There is an ExpandColumns operation on a data frame, which solves part of your problem. The operation expands all columns that contain objects into multiple columns containing the properties of those objects:

    // The argument indicates how many levels of nesting to expand
    var expanded = df.ExpandColumns(1);
    

    This will expand SomeOtherObject into SomeOtherObject.OtherObjectName, but this does not deal with arrays (it does not turn a single row into multiple rows).

    For arrays, I don't think there is any good built-in solution (other than just looking at the raw data and manipulating that). So my recommendation would probably be to use some other tool to turn your JSON data into CSV first and then load that in Deedle.
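If converting to CSV first is not convenient, the "manipulate the raw data" route mentioned above can be sketched directly in C#: project each array element into one flat row with LINQ's SelectMany, then hand the flat rows to Frame.FromRecords. This is only a minimal sketch, not a built-in Deedle feature. The FlatRow class and the FlattenExample helper are hypothetical names introduced for illustration, the model classes are repeated from the question so the snippet is self-contained, and it assumes the Deedle NuGet package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Deedle;

// Model classes repeated from the question.
public class SomeOtherObject { public string OtherObjectName { get; set; } }
public class SomeAnotherObject
{
    public string AnotherObjectName { get; set; }
    public bool Flag { get; set; }
}
public class SomeObject
{
    public string Id { get; set; }
    public string Name { get; set; }
    public SomeOtherObject SomeOtherObject { get; set; }
    public SomeAnotherObject[] SomeAnotherObjectArr { get; set; }
}

// Hypothetical flat record: one instance per (object, array element) pair.
public class FlatRow
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string OtherObjectName { get; set; }
    public string AnotherObjectName { get; set; }
    public bool Flag { get; set; }
}

// Hypothetical helper, not part of Deedle.
public static class FlattenExample
{
    // Turns each element of SomeAnotherObjectArr into its own row,
    // repeating the parent's Id, Name and OtherObjectName on every row.
    public static IEnumerable<FlatRow> Flatten(IEnumerable<SomeObject> objects)
    {
        return objects.SelectMany(o =>
            (o.SomeAnotherObjectArr ?? Array.Empty<SomeAnotherObject>())
                .Select(a => new FlatRow
                {
                    Id = o.Id,
                    Name = o.Name,
                    OtherObjectName = o.SomeOtherObject?.OtherObjectName,
                    AnotherObjectName = a.AnotherObjectName,
                    Flag = a.Flag
                }));
    }

    // Builds the frame from the flat rows; every property is a simple value,
    // so no column expansion is needed afterwards.
    public static Frame<int, string> ToFrame(IEnumerable<SomeObject> objects)
    {
        return Frame.FromRecords(Flatten(objects));
    }
}
```

With the code from the question, this would be called as FlattenExample.ToFrame(new[] { someObject }).Print(). Note that in this sketch an object whose array is null or empty contributes no rows at all; if such objects should still appear, the projection would need to emit a placeholder row for them.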