
Converting JSON to CSV using the command line (jq, json2csv, Python, other)

I want to write a script that will

  1. fetch information and return a JSON file
  2. filter the JSON file
  3. convert that JSON to CSV.

I have figured out steps 1 and 2, but am stuck on step 3. Currently I have to use an online JSON-to-CSV converter to get the desired output.

The online JSON-to-CSV tool lets users connect to its API from Python to use the conversion tool, which possibly means that the tool itself is a Python module.

JSON file to convert

  "clubs": {
    "39335": {
      "details": {
        "name":"Team one",
    "111655": {
      "details": {
        "name":"Team two",
  "players": {
    "39335": {
      "189908959": {
        "playername":"player one"
      "828715674": {
        "playername":"player two"
    "111655": {
      "515447555": {
        "playername":"player three"
      "806370074": {
        "playername":"player four"

Desired CSV output

"2068447050405","1658361314","39335","486","Team one","39335","39335","189908959","defenseMen","3600","player one"
"2068447050405","1658361314","111655","229","Team two","111655","39335","828715674","rightWing","3600","player two"
"2068447050405","1658361314","","","","","111655","515447555","defenseMen","3600","player three"
"2068447050405","1658361314","","","","","111655","806370074","center","3600","player four"

How it looks in a spreadsheet: Sheet example

Some believe the filter is affecting how the CSV output is formatted, so here is a link to the full JSON file and the CSV output of that file. The code is too long to post on this page.

Original JSON before filter: Original JSON

CSV output of original JSON file: CSV output

Edit: I should have mentioned this. The "JSON file to convert" is only a small sample of the actual JSON I wish to convert. I assumed I would be able to simply add to the code used in the answer; I was wrong. The JSON I intend to use has 9 total columns for clubs and 52 columns for players.


  • I'm working hard to really grok jq, so here you go, with no explanation:

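    jq -r '
        # Capture the top-level fields that are repeated on every row.
        [.matchId, .timestamp] as [$matchId, $timestamp]
        # One array per player, prefixed with the club key it is nested under.
        | (.players | [to_entries[] | .key as $id1 | .value | to_entries[] | [$id1, .key, .value.position, .value.toiseconds, .value.playername]]) as $players
        # One array per club.
        | (.clubs | [to_entries[] | [.key, .value.toa, .value.details.name, .value.details.clubId]]) as $clubs
        # Emit one row per index, padding the shorter list with empty strings.
        | range([$players, $clubs] | map(length) | max)
        | [$matchId, $timestamp] + ($clubs[.] // ["","","",""]) + ($players[.] // ["","","","",""])
        | @csv
    ' file.json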
    "2068447050405",1658361314,"39335","486","Team one",39335,"39335","189908959","defenseMen","3600","player one"
    "2068447050405",1658361314,"111655","229","Team two",111655,"39335","828715674","rightWing","3600","player two"
    "2068447050405",1658361314,"","","","","111655","515447555","defenseMen","3600","player three"
    "2068447050405",1658361314,"","","","","111655","806370074","center","3600","player four"

    The default arrays of empty strings need to be the same length as the arrays of "real" data you're grabbing.

    Since this is a PITA to keep aligned, an update:

    jq -r '
        # Build an array of empty strings as long as the input array.
        def empty_strings: reduce range(length) as $i ([]; . + [""]);
        [.matchId, .timestamp] as [$matchId, $timestamp]
        | (.players | [to_entries[] | .key as $id1 | .value | to_entries[] | [$id1, .key, .value.position, .value.toiseconds, .value.playername]]) as $players
        | (.clubs | [to_entries[] | [.key, .value.toa, .value.details.name, .value.details.clubId]]) as $clubs
        | range([$players, $clubs] | map(length) | max)
        | [$matchId, $timestamp]
          + ($clubs[.]   // ($clubs[0] | empty_strings))
          + ($players[.] // ($players[0] | empty_strings))
        | @csv
    ' file.json
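
Since the question also lists Python as an option, here is a rough sketch of the same side-by-side layout using only the standard library. It is not the online tool's method, just one way to do it. The field names (`matchId`, `timestamp`, `toa`, `clubId`, `position`, `toiseconds`, `playername`) are taken from the jq filter above, and the function names are made up for illustration. The padding width is derived from the first real row, so adding more columns to `club_rows` or `player_rows` should not need any other change.

```python
import csv
import io
from itertools import zip_longest


def club_rows(clubs):
    # One row per club: the club key plus a few chosen fields.
    rows = []
    for club_id, club in clubs.items():
        details = club.get("details", {})
        rows.append([club_id, club.get("toa", ""),
                     details.get("name", ""), details.get("clubId", "")])
    return rows


def player_rows(players):
    # One row per player, prefixed with the club key the player is nested under.
    rows = []
    for club_id, members in players.items():
        for player_id, p in members.items():
            rows.append([club_id, player_id, p.get("position", ""),
                         p.get("toiseconds", ""), p.get("playername", "")])
    return rows


def json_to_csv(data):
    clubs = club_rows(data.get("clubs", {}))
    players = player_rows(data.get("players", {}))
    # Padding has the same width as a real row, so columns stay aligned.
    club_pad = [""] * (len(clubs[0]) if clubs else 0)
    player_pad = [""] * (len(players[0]) if players else 0)

    out = io.StringIO()
    # QUOTE_ALL quotes every field, numbers included, as in the desired output.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    # zip_longest yields None once the shorter list runs out.
    for club, player in zip_longest(clubs, players):
        writer.writerow(
            [data.get("matchId", ""), data.get("timestamp", "")]
            + (club if club is not None else club_pad)
            + (player if player is not None else player_pad)
        )
    return out.getvalue()
```

To use it, something like `print(json_to_csv(json.load(open("file.json"))), end="")` after `import json` should be enough, assuming `file.json` holds the filtered JSON.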