swift coreml createml

How to define / change MLDataValue.ValueType for a column in MLDataTable


I am loading an MLDataTable from a .csv file. The data type of each column is inferred automatically from the content of the input file.
I need predictable, explicit types when I process the table later.

How can I enforce a certain type when loading a file or alternatively change the type in a second step?

Simplified Example:

import Foundation
import CreateML

// file.csv:
//
// value1,value2
// 1.5,1

let table = try MLDataTable(contentsOf: URL(fileURLWithPath: "/path/to/file.csv"))
print(table.columnTypes)

// actual output:  
// ["value2": Int, "value1": Double]       <--- type for value2 is 'Int'
//
// wanted output:  
// ["value2": Double, "value1": Double]    <--- how can I make it 'Double'?

Solution

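  • Use the map(to:) method to derive a new column with the desired underlying type from the existing one. The subscript table["Ints"] returns an MLUntypedColumn, and calling map(to:) on it yields a typed MLDataColumn: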

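    import CreateML

    // Build a one-column table of Ints (1, 4, 9, 16, 25).
    let squaresArrayInt = (1...5).map { $0 * $0 }
    var table = try! MLDataTable(dictionary: ["Ints": squaresArrayInt])
    print(table)

    // Convert the "Ints" column to Double and add the result as a new column.
    let squaresColumnDouble = table["Ints"].map(to: Double.self)
    table.addColumn(squaresColumnDouble, named: "Doubles")
    print(table)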
    

    Produces the following output:

    Columns:
        Ints    integer
    Rows: 5
    Data:
    +----------------+
    | Ints           |
    +----------------+
    | 1              |
    | 4              |
    | 9              |
    | 16             |
    | 25             |
    +----------------+
    [5 rows x 1 columns]
    
    
    Columns:
        Ints    integer
        Doubles float
    Rows: 5
    Data:
    +----------------+----------------+
    | Ints           | Doubles        |
    +----------------+----------------+
    | 1              | 1              |
    | 4              | 4              |
    | 9              | 9              |
    | 16             | 16             |
    | 25             | 25             |
    +----------------+----------------+
    [5 rows x 2 columns]
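
Applied to the table from the question, the same idea could be used to replace value2 in place rather than adding a second column. The following is only a sketch: convertColumnToDouble is a hypothetical helper (it is not part of CreateML), and the table is built from a dictionary as a stand-in for the CSV file so the snippet does not depend on a file on disk. It assumes that removing a column and re-adding the converted one under the same name is acceptable, which moves that column to the end of the table.

```swift
import Foundation
import CreateML

// Hypothetical helper, not a CreateML API: replaces the column `name`
// with a copy converted to Double, keeping the same column name.
func convertColumnToDouble(_ name: String, in table: inout MLDataTable) {
    // Derive the Double column first, while the original column still exists.
    let converted = table[name].map(to: Double.self)
    // Drop the original column, then add the converted one under the old name.
    table.removeColumn(named: name)
    table.addColumn(converted, named: name)
}

// Stand-in for the CSV from the question: value1 is Double, value2 is Int.
var table = try MLDataTable(dictionary: [
    "value1": [1.5],
    "value2": [1]
])

convertColumnToDouble("value2", in: &table)
print(table.columnTypes)
```

With a real file, the dictionary initializer would be swapped for MLDataTable(contentsOf:) as in the question, and the helper called once per column whose inferred type needs to be overridden. The intent is that columnTypes then reports Double for value2.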