jsondecode function in Terraform is not reading the columns in the same order specified in JSON file


I have a JSON file, emp.json, that looks something like this:

{
    "columns": {
        "%KEY_EMP_ID": "string",
        "EMP_DEP_ID": "string",
        "EMP_FULLNAME": "string",
        "EMP_STAFFNUMBER": "string",
        "EMP_EMAIL_ADDRESS": "string",
        "EMP_LOCATION_NAME": "string",
        "EMP_GRADE_NAME": "string",
        "EMP_GRADE_BAND": "string",
        "EMP_SAL": "string"
    }
}

I am using the Terraform aws_glue_catalog_table resource to create a Glue table:

resource "aws_glue_catalog_table" "aws_glue_catalog_table" {
  name          = "emp_table"
  database_name = "emp_db"
  table_type    = "EXTERNAL_TABLE"

  parameters = {
    EXTERNAL              = "TRUE"
    "parquet.compression" = "SNAPPY"
  }

  storage_descriptor {
    location      = "s3://my-bucket-emp/output/emp-stream"
    input_format  = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat"
    output_format = "org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat"

    ser_de_info {
      serialization_library = "org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe"
      parameters = {
        "serialization.format" = 1
      }
    }

    dynamic "columns" {
      for_each = jsondecode(file("${local.glue_config_path}/emp.json")).columns

      content {
        name = columns.key
        type = columns.value
      }
    }
  }
}

When I run this code, the Glue table columns are not created in the order specified in emp.json. How can I get the columns in the same order as in the JSON file?


Solution

  • When you use jsondecode with a JSON object, Terraform interprets the JSON object as an object type in Terraform's type system.

    Terraform object types do not retain any information about the order of their attributes, so jsondecode has no option but to discard that information during decoding. When you iterate over the result, Terraform visits the attributes in lexical order by name, which is why the columns come out sorted rather than in file order. Discarding order is common with JSON decoding in many programming languages, since their dictionary/map/object types often do not preserve the order of elements/attributes/properties/etc.

    If you want to preserve ordering during jsondecode then you will need to use a JSON array instead of a JSON object, which Terraform will then map to a tuple type that can preserve the order:

    {
      "columns": [
        {"name":"%KEY_EMP_ID","type":"string"},
        {"name":"EMP_DEP_ID","type":"string"},
        (...etc...)
      ]
    }
    
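    Inside the dynamic block, columns.value is then the current array element, so name and type are read from its attributes:

    dynamic "columns" {
      for_each = jsondecode(file("${local.glue_config_path}/emp.json")).columns

      content {
        name = columns.value.name
        type = columns.value.type
      }
    }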
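
The difference between the two JSON shapes can be checked without touching AWS. The following is a minimal sketch, assuming a scratch directory with only this file in it. The inline JSON and the names object_json, array_json, from_object and from_array are hypothetical and only stand in for emp.json. It needs no provider, so `terraform init` followed by `terraform apply` should just print the two outputs.

```hcl
# Hypothetical stand-ins for emp.json, with two columns deliberately
# listed in non-alphabetical order.
locals {
  object_json = <<-JSON
    {"columns": {"EMP_SAL": "string", "EMP_DEP_ID": "string"}}
  JSON

  array_json = <<-JSON
    {"columns": [
      {"name": "EMP_SAL", "type": "string"},
      {"name": "EMP_DEP_ID", "type": "string"}
    ]}
  JSON

  # JSON object -> Terraform object: iteration is in lexical order by name.
  from_object = [for k, v in jsondecode(local.object_json).columns : k]

  # JSON array -> Terraform tuple: iteration follows the order in the file.
  from_array = [for col in jsondecode(local.array_json).columns : col.name]
}

# Expected: ["EMP_DEP_ID", "EMP_SAL"]
output "from_object" { value = local.from_object }

# Expected: ["EMP_SAL", "EMP_DEP_ID"]
output "from_array" { value = local.from_array }
```

The dynamic "columns" block iterates in the same way as these for expressions, so only the array form gives Glue the columns in file order.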