f# f#-interactive deedle dotnet-interactive

What is the equivalent to pandas dataframe info() in Deedle?


Python's pandas library provides info(), which prints a summary of a data frame: the index range, each column's non-null count, and its inferred type.

For example:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype  
---  ------         --------------  -----  
 0   Name           30 non-null     object 
 1   PhoneNumber    30 non-null     object 
 2   City           30 non-null     object 
 3   Address        30 non-null     object 
 4   PostalCode     30 non-null     object 
 5   BirthDate      30 non-null     object 
 6   Income         26 non-null     float64
 7   CreditLimit    30 non-null     object 
 8   MaritalStatus  24 non-null     object 
dtypes: float64(1), object(8)
memory usage: 2.2+ KB

Is there an equivalent for Deedle's data frame? Something that gives an overview of the missing values and the inferred types.


Solution

  • There isn't a single function that does this. It would be a nice addition to the library if you wanted to consider sending a pull request.

    The following gets all the information you would need:

    // Prints column names and types, with data preview
    df.Print(true)
    
    // Print key range of rows (or key sequence if it is not ordered)
    if df.RowIndex.IsOrdered then printfn "%A" df.RowIndex.KeyRange
    else printfn "%A" df.RowIndex.Keys
    
    // Get access to the data of the frame so that we can inspect the columns
    let dt = df.GetFrameData()
    for n, (ty, vec) in Seq.zip dt.ColumnKeys dt.Columns do 
      // Print name, type of column
      printf "%A %A" n ty
      // Query the internal data storage to see whether it uses an
      // array of optional values (may have missing values) or not
      match vec.Data with 
      | Vectors.VectorData.DenseList _ -> printfn " (no nulls)"
      | _ -> printfn " (nulls)"
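
    If a per-column count of non-missing values is more useful than a yes/no flag, the pieces above can be combined into a small helper. This is only a sketch: `columnInfo`, `printInfo`, and the `Person` record are hypothetical names, not part of Deedle, and it assumes your Deedle version exposes `ColumnTypes` on the frame and `ValueCount` on a series.

    ```fsharp
    #r "nuget: Deedle"
    open Deedle

    // Hypothetical helper (not part of Deedle): one summary entry per column,
    // as (column key, number of non-missing values, type name).
    let columnInfo (df: Frame<_, _>) =
        Seq.zip df.ColumnKeys df.ColumnTypes
        |> Seq.map (fun (key, ty) ->
            // ValueCount counts only the values that are present
            key, df.Columns.[key].ValueCount, ty.Name)
        |> List.ofSeq

    // Hypothetical helper: print an overview loosely modelled on pandas' info()
    let printInfo (df: Frame<_, _>) =
        printfn "%d rows, %d columns" df.RowCount df.ColumnCount
        for key, nonMissing, tyName in columnInfo df do
            printfn "%O  %d non-missing  %s" key nonMissing tyName

    // Illustrative data only
    type Person = { Name: string; Income: float }

    let sample =
        Frame.ofRecords [ { Name = "Ann"; Income = 10.0 }
                          { Name = "Bob"; Income = 25.5 } ]

    printInfo sample
    ```

    Returning the summary as a list from `columnInfo` keeps the printing separate, so the same data can be filtered (for example, to list only columns with missing values) before it is displayed.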