Python's pandas library lets you call info() on a data frame to get a summary of it. For example:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30 entries, 0 to 29
Data columns (total 9 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Name           30 non-null     object
 1   PhoneNumber    30 non-null     object
 2   City           30 non-null     object
 3   Address        30 non-null     object
 4   PostalCode     30 non-null     object
 5   BirthDate      30 non-null     object
 6   Income         26 non-null     float64
 7   CreditLimit    30 non-null     object
 8   MaritalStatus  24 non-null     object
dtypes: float64(1), object(8)
memory usage: 2.2+ KB
Is there an equivalent for Deedle's data frame, something that gives an overview of the missing values and the inferred types?
There isn't a single function to do this. It would be a nice addition to the library, if you wanted to consider sending a pull request.
The following gets all the information you would need:
// Prints column names and types, with data preview
df.Print(true)

// Print key range of rows (or key sequence if it is not ordered)
if df.RowIndex.IsOrdered then printfn "%A" df.RowIndex.KeyRange
else printfn "%A" df.RowIndex.Keys

// Get access to the data of the frame so that we can inspect the columns
let dt = df.GetFrameData()
for n, (ty, vec) in Seq.zip dt.ColumnKeys dt.Columns do
  // Print name, type of column
  printf "%A %A" n ty
  // Query the internal data storage to see if it uses
  // array of optional values (may have nulls) or not
  match vec.Data with
  | Vectors.VectorData.DenseList _ -> printfn " (no nulls)"
  | _ -> printfn " (nulls)"
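The snippet above only tells you whether a column may contain nulls. If you also want per-column counts, closer to the "Non-Null Count" column that pandas prints, something along these lines might work. This is a rough sketch rather than a built-in Deedle feature: printInfo, Person and sample are hypothetical names made up for illustration, the #r line assumes you are running an F# script, and it assumes that Deedle treats nan in a float column as a missing value.

```fsharp
#r "nuget: Deedle"
open Deedle

// Hypothetical helper (not part of Deedle): prints the frame size and then
// one line per column with its key, non-missing value count and element type.
let printInfo (df: Frame<'R, 'C>) =
    printfn "Rows: %d, columns: %d" df.RowCount df.ColumnCount
    for key, ty in Seq.zip df.ColumnKeys df.ColumnTypes do
        // Columns.[key] views the column as an ObjectSeries;
        // ValueCount counts only the values that are present
        let nonMissing = df.Columns.[key].ValueCount
        printfn "%A: %d non-missing, %s" key nonMissing ty.Name

// Made-up sample data, only here to exercise the helper
type Person = { Name: string; Income: float }

let sample =
    Frame.ofRecords
        [ { Name = "Ann"; Income = 10.0 }
          { Name = "Bob"; Income = nan }   // assumed to be stored as missing
          { Name = "Cy"; Income = 30.0 } ]

printInfo sample
```

Unlike the check against DenseList, this walks the values of every column, so it does more work on large frames, but it reports how many values are present rather than just whether some might be missing.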