
Split large hash-structured JSON file into multiple smaller files


I am working with a very large JSON file that has a hash-like structure:

{
  "1893": {
    "foo": {
      "2600": {
        ...[snip]...
      },
      "3520": {
        ...[snip]...
      }
    },
    "id": "foobar"
  },
  "123": {
    "bar": {
      "4989": {
        ...[snip]...
      },
      "0098": {
        ...[snip]...
      }
    },
    "id": "foobaz"
  },
  ...[snip]...
  "5553": {
    "baz": {
      "2600": {
        ...[snip]...
      },
      "3520": {
        ...[snip]...
      }
    },
    "id": "bazqux"
  }
}

(This file is similar to Stripe's migration mapping file)

I would like to split this file into multiple smaller ones, each of which must itself be valid JSON. Since the "root" is a hash, I don't really care how the file is split, as long as the resulting files contain an approximately equal number of items.

I tried looking at jq, but I can't quite get a grasp of how to achieve this properly. I would appreciate any guidance toward a working jq solution, or any other tool that could help.


Solution

  • I've managed to cook something up using jq:

    • first, check the number of items in $file:

      jq 'length' < "$file"
      
    • then pick slice indices ($from and $to, stepping by roughly length divided by the number of output files) and save each slice to $output:

      jq -c "to_entries[$from:$to] | from_entries" < "$file" > "$output"
      
      (the double quotes are deliberate: the shell interpolates $from and $to into the jq program before jq runs)
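To show the two commands working together, here is a minimal sketch of a loop that drives the slicing. The sample data, the $chunk_size of 2, and the part_N.json output names are illustrative assumptions, not part of the original answer; it also uses jq's --argjson to pass the indices in as numbers, which avoids relying on shell interpolation inside the program.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for the large input file (illustrative sample data).
file=$(mktemp)
cat > "$file" <<'EOF'
{"1893": {"id": "foobar"}, "123": {"id": "foobaz"}, "5553": {"id": "bazqux"}}
EOF

chunk_size=2                       # items per output file (assumed)
total=$(jq 'length' < "$file")     # step 1: count the items

i=0
from=0
while [ "$from" -lt "$total" ]; do
  to=$((from + chunk_size))
  # Step 2: slice the entries [$from, $to) and rebuild a JSON object.
  jq --argjson from "$from" --argjson to "$to" \
     'to_entries[$from:$to] | from_entries' < "$file" > "part_${i}.json"
  from=$to
  i=$((i + 1))
done

jq 'length' part_0.json   # 2
jq 'length' part_1.json   # 1
```

With three items and a chunk size of two, this yields part_0.json holding the first two entries and part_1.json holding the last one; each output file is a complete, valid JSON object on its own.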