
Parsing complex JSON where data and "column headers" are in separate arrays


I have the following JSON data I get from an API:

{"datatable": 
  {"data" : [
    ["John", "Doe", "1990-01-01", "Chicago"], 
    ["Jane", "Doe", "2000-01-01", "San Diego"]
  ], 
  "columns": [
    { "name": "First", "type": "String" }, 
    { "name": "Last", "type": "String" },
    { "name": "Birthday", "type": "Date" }, 
    { "name": "City", "type": "String" }
  ]}
}

A later query could return the following:

{"datatable": 
  {"data" : [
    ["Chicago", "Doe", "John", "1990-01-01"], 
    ["San Diego", "Doe", "Jane", "2000-01-01"]
  ], 
  "columns": [
    { "name": "City", "type": "String" },
    { "name": "Last", "type": "String" },
    { "name": "First", "type": "String" }, 
    { "name": "Birthday", "type": "Date" }
  ]
  }
}

The order of the columns seems to be fluid.

I initially wanted to decode the JSON with JSONDecoder, but for that each row in the data array would need to be a dictionary keyed by column name rather than an array of bare values. The only other method I could think of was to convert the result to a dictionary with something like:

extension String {
    func convertToDictionary() -> [String: Any]? {
        if let data = data(using: .utf8) {
            return try? JSONSerialization.jsonObject(with: data, options: []) as? [String: Any]
        }
        return nil
    }
}

However, this leaves me with a lot of nested if let statements such as if let x = dictOfStr["datatable"] as? [String: Any] { ... }, not to mention the subsequent loop through the columns array to organize the data.

Is there a better solution? Thanks


Solution

  • You could still use JSONDecoder, but you'd need to manually decode the data array.

    To do that, you'd need to read the columns array, and then decode the data array using the ordering that you got from the columns array.
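
    As a minimal sketch of that idea on its own, before any custom decoding: the JSON can be decoded as-is and each row paired with the column names by index. This is a simpler, dictionary-based variant rather than the key path approach described next, and the names RawColumn, RawTable, RawResponse and rowsByColumnName are made up for illustration. It also assumes every value in data is a string.

    ```swift
    import Foundation

    // Hypothetical sample payload with the same shape as in the question.
    let json = """
    {"datatable": {
      "data": [["Chicago", "Doe", "John", "1990-01-01"]],
      "columns": [
        {"name": "City", "type": "String"},
        {"name": "Last", "type": "String"},
        {"name": "First", "type": "String"},
        {"name": "Birthday", "type": "Date"}
      ]}}
    """

    // Plain Decodable types that mirror the JSON as-is (illustrative names).
    struct RawColumn: Decodable { let name: String }
    struct RawTable: Decodable {
        let data: [[String]]
        let columns: [RawColumn]
    }
    struct RawResponse: Decodable { let datatable: RawTable }

    // Pair every value with the column name found at the same index.
    func rowsByColumnName(_ table: RawTable) -> [[String: String]] {
        let names = table.columns.map { $0.name }
        return table.data.map { row in
            // zip stops at the shorter sequence, so a short row cannot crash;
            // if a column name repeats, the first value wins.
            Dictionary(zip(names, row), uniquingKeysWith: { first, _ in first })
        }
    }

    let response = try JSONDecoder().decode(RawResponse.self, from: Data(json.utf8))
    let rows = rowsByColumnName(response.datatable)
    print(rows[0]["First"] ?? "missing") // prints John
    ```

    The lookup is now by column name, so the column order no longer matters, but every value is still an untyped string.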

    This is actually a nice use case for KeyPaths. You can create a mapping of columns to object properties, and this helps avoid a large switch statement.

    So here's the setup:

    import Foundation
    
    struct DataRow {
      var first, last, city: String?
      var birthday: Date?
    }
    
    struct DataTable: Decodable {
    
      var data: [DataRow] = []
    
      // coding key for root level
      private enum RootKeys: CodingKey { case datatable }
    
      // coding key for columns and data
      private enum CodingKeys: CodingKey { case data, columns }
    
      // mapping of json fields to properties
      private let fields: [String: PartialKeyPath<DataRow>] = [
         "First":    \DataRow.first,
         "Last":     \DataRow.last,
         "City":     \DataRow.city,
         "Birthday": \DataRow.birthday ]
    
      // the "type" property in the JSON is ignored here
      private struct Column: Decodable { let name: String }
    
      // init ...
    }
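
    The mapping above stores type-erased PartialKeyPath values, and the init has to recover the concrete key path type before it can assign anything. Here is a small standalone sketch of that cast, using a hypothetical Person type and assign function that are not part of the answer's code:

    ```swift
    // Hypothetical type used only for this illustration.
    struct Person {
        var name: String?
        var age: Int?
    }

    // Type-erased key paths can share one dictionary even though the
    // properties they point at have different value types.
    let personFields: [String: PartialKeyPath<Person>] = [
        "Name": \Person.name,
        "Age":  \Person.age
    ]

    // Assigns a raw string to whichever property the column maps to.
    func assign(_ raw: String, column: String, to person: inout Person) {
        guard let keyPath = personFields[column] else { return } // unknown column

        // The dynamic cast recovers the concrete, writable key path type.
        switch keyPath {
        case let kp as WritableKeyPath<Person, String?>:
            person[keyPath: kp] = raw
        case let kp as WritableKeyPath<Person, Int?>:
            person[keyPath: kp] = Int(raw)
        default:
            break // property type not handled here
        }
    }

    var person = Person()
    assign("Jane", column: "Name", to: &person)
    assign("30", column: "Age", to: &person)
    assign("ignored", column: "Nickname", to: &person)
    print(person.name ?? "nil", person.age ?? -1) // prints Jane 30
    ```

    Supporting a new property type means adding one case to the switch, and supporting a new column means adding one entry to the dictionary.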
    

    Now the init function:

    init(from decoder: Decoder) throws {
       let root = try decoder.container(keyedBy: RootKeys.self)
       let inner = try root.nestedContainer(keyedBy: CodingKeys.self, forKey: .datatable)
    
       let columns = try inner.decode([Column].self, forKey: .columns)
    
       // create the formatter once instead of once per value;
       // lowercase yyyy and dd mean calendar year and day of month
       // (YYYY is the week-based year and DD is the day of the year)
       let dateFormatter = DateFormatter()
       dateFormatter.locale = Locale(identifier: "en_US_POSIX")
       dateFormatter.dateFormat = "yyyy-MM-dd"
    
       // for data, there's more work to do
       var data = try inner.nestedUnkeyedContainer(forKey: .data)
    
       // for each data row
       while !data.isAtEnd {
          let values = try data.decode([String].self)
    
          var dataRow = DataRow()
    
          // decode each property; zip avoids an out-of-range index
          // when a row and the columns array differ in length
          for (column, value) in zip(columns, values) {
             let keyPath = fields[column.name]
    
             // now need to decode a string value into the correct type
             switch keyPath {
             case let kp as WritableKeyPath<DataRow, String?>:
                dataRow[keyPath: kp] = value
             case let kp as WritableKeyPath<DataRow, Date?>:
                dataRow[keyPath: kp] = dateFormatter.date(from: value)
             default: break // unknown column name
             }
          }
    
          self.data.append(dataRow)
       }
    }
    

    To use this, decode with JSONDecoder as usual:

    let jsonDecoder = JSONDecoder()
    let dataTable = try jsonDecoder.decode(DataTable.self, from: jsonData)
    
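    // the properties are optionals, so unwrap them before printing
    print(dataTable.data[0].first ?? "") // prints John
    if let birthday = dataTable.data[0].birthday {
       print(birthday) // e.g. 1990-01-01 05:00:00 +0000, depending on the local time zone
    }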
    

    EDIT

    The code above assumes that all the values in a JSON array are strings and tries to do decode([String].self). If you can't make that assumption, you could decode the values to their underlying primitive types supported by JSON (number, string, bool, or null). It would look something like this:

    enum JSONValue: Decodable {
      case string(String), number(Double), bool(Bool), null, unknown
    
      init(from decoder: Decoder) throws {
         let container = try decoder.singleValueContainer()
    
         if let v = try? container.decode(String.self) {
           self = .string(v)
         } else if let v = try? container.decode(Double.self) {
           self = .number(v)
         } else if let v = try? container.decode(Bool.self) {
           self = .bool(v)
         } else if container.decodeNil() {
           self = .null
         } else {
           // nested arrays and objects are not handled here
           self = .unknown
         }
      }
    }
    

    Then, in the code above, decode the array into these values:

    let values = try data.decode([JSONValue].self)
    

    Later when you need to use the value, you can examine the underlying value and decide what to do:

    // assumes DataRow has an Int? property, which the struct above does not declare
    case let kp as WritableKeyPath<DataRow, Int?>:
      switch value {
        case .number(let v):
           // e.g. round the number and cast to Int
           dataRow[keyPath: kp] = Int(v.rounded())
        case .string(let v):
           // e.g. attempt to convert string to Int
           dataRow[keyPath: kp] = Int((Double(v) ?? 0.0).rounded())
        default: break
      }
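
    To see the mixed-type decoding end to end, here is a self-contained sketch that decodes one row with a string, a number, a bool and a null, then converts each value to an Int where possible. The enum is repeated so the snippet runs on its own; the intValue helper is an assumption added for illustration, and unlike the fragment above it yields nil instead of 0 for non-numeric strings.

    ```swift
    import Foundation

    // Same idea as the enum above, repeated so the sketch is self-contained.
    enum JSONValue: Decodable {
        case string(String), number(Double), bool(Bool), null, unknown

        init(from decoder: Decoder) throws {
            let container = try decoder.singleValueContainer()
            if container.decodeNil() {
                self = .null
            } else if let v = try? container.decode(Bool.self) {
                self = .bool(v)
            } else if let v = try? container.decode(Double.self) {
                self = .number(v)
            } else if let v = try? container.decode(String.self) {
                self = .string(v)
            } else {
                self = .unknown // nested arrays or objects
            }
        }

        // Hypothetical helper: best-effort conversion to Int.
        // Int(exactly:) returns nil instead of trapping on out-of-range values.
        var intValue: Int? {
            switch self {
            case .number(let v): return Int(exactly: v.rounded())
            case .string(let s): return Double(s).flatMap { Int(exactly: $0.rounded()) }
            default: return nil
            }
        }
    }

    // Hypothetical row mixing a numeric string, a number, a bool and a null.
    let rowJSON = Data(#"["42", 41.6, true, null]"#.utf8)
    let row = try JSONDecoder().decode([JSONValue].self, from: rowJSON)
    let ints = row.map { $0.intValue }
    print(ints)
    ```

    Both "42" and 41.6 come out as 42 here, while the bool and the null become nil, so the caller can decide how to treat values that are not numeric.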