
How to select fields at different levels of a JSON file with JsonPath?


I want to convert JSON objects into CSV files. My (working) attempt so far is to load the JSON file as a JSONObject (from the googlecode json-simple library), then convert it with JsonPath into a string array, which is then used to build the CSV rows. However, I am facing a problem with JsonPath. From the given example JSON...

{
"issues": [
    {
        "key": "abc",
        "fields": {
            "issuetype": {
                "name": "Bug",
                "id": "1",
                "subtask": false
            },
            "priority": {
                "name": "Major",
                "id": "3"
            },
            "created": "2020-5-11",
            "status": {
                "name": "OPEN"
            }
        }
    },
    {
        "key": "def",
        "fields": {
            "issuetype": {
                "name": "Info",
                "id": "5",
                "subtask": false
            },
            "priority": {
                "name": "Minor",
                "id": "2"
            },
            "created": "2020-5-8",
            "status": {
                "name": "DONE"
            }
        }
    }
]}

I want to select the following:

[
    "abc",
    "Bug",
    "Major",
    "2020-5-11",
    "OPEN",
    "def",
    "Info",
    "Minor",
    "2020-5-8",
    "DONE"
]

The CSV should look like this:

abc,Bug,Major,2020-5-11,OPEN
def,Info,Minor,2020-5-8,DONE

I tried $.issues.[*].[key,fields] and I get

[
  "abc",
  {
    "issuetype": {
      "name": "Bug",
      "id": "1",
      "subtask": false
    },
    "priority": {
      "name": "Major",
      "id": "3"
    },
    "created": "2020-5-11",
    "status": {
      "name": "OPEN"
    }
  },
  "def",
  {
    "issuetype": {
      "name": "Info",
      "id": "5",
      "subtask": false
    },
    "priority": {
      "name": "Minor",
      "id": "2"
    },
    "created": "2020-5-8",
    "status": {
      "name": "DONE"
    }
  }
]

But when I try to select e.g. only "created" with $.issues.[*].[key,fields.[created] this is the result:

[
  "2020-5-11",
  "2020-5-8"
]

But I just do not get how to select "key" and e.g. "name" in the field issuetype. How do I do that with JsonPath, or is there a better way to filter a JSON file and then convert it into CSV?


Solution

  • I recommend what I believe is a better way - which is to create a set of Java classes which represent the structure of your JSON data. When you read the JSON into these classes, you can manipulate the data using standard Java.

    I also recommend a different JSON parser - in this case Jackson, but there are others. Why? Mainly, familiarity - see later on for more notes on that.

    Starting with the end result: Assuming I have a class called Container which contains all the issues listed in the JSON file, I can then populate it with the following:

    //import com.fasterxml.jackson.databind.ObjectMapper;
    
    String jsonString = "{...}"; // your JSON data as a string, for this demo.
    ObjectMapper objectMapper = new ObjectMapper();
    Container container = objectMapper.readValue(jsonString, Container.class);
    

    Now I can print out all the issues in the CSV format you want as follows:

    container.getIssues().forEach((issue) -> {
        printCsvRow(issue);
    });
    

    Here, the printCsvRow() method looks like this:

    private void printCsvRow(Issue issue) {
        String key = issue.getKey();
        Fields fields = issue.getFields();
        String type = fields.getIssuetype().getName();
        String priority = fields.getPriority().getName();
        String created = fields.getCreated();
        String status = fields.getStatus().getName();
        System.out.println(String.join(",", key, type, priority, created, status));
    }
    

    In reality, I would use a CSV library to ensure records are formatted correctly - the above is just for illustration, to show how the JSON data can be accessed.

    The following is printed:

    abc,Bug,Major,2020-5-11,OPEN
    def,Info,Minor,2020-5-8,DONE
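
As noted above, a plain String.join() does not protect against commas, quotes, or line breaks inside a value. The following is a minimal sketch of the quoting that a CSV library would normally handle for you, following the RFC 4180 convention. The class name CsvLine and its methods are hypothetical helpers made up for this illustration; they are not part of Jackson or of any CSV library.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; a real CSV library covers more edge cases.
final class CsvLine {

    private CsvLine() {
    }

    // Quote a field only when it contains a comma, a double quote, or a line break.
    // Embedded double quotes are doubled, following the RFC 4180 convention.
    static String escape(String field) {
        if (field == null) {
            return "";
        }
        boolean needsQuotes = field.contains(",") || field.contains("\"")
                || field.contains("\n") || field.contains("\r");
        if (!needsQuotes) {
            return field;
        }
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    // Join already-extracted field values into one CSV record (no line terminator).
    static String of(List<String> fields) {
        return fields.stream().map(CsvLine::escape).collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        // prints: abc,Bug,Major,2020-5-11,OPEN
        System.out.println(of(Arrays.asList("abc", "Bug", "Major", "2020-5-11", "OPEN")));
        // prints: def,"Say ""hi"", please",Minor,2020-5-8,DONE
        System.out.println(of(Arrays.asList("def", "Say \"hi\", please", "Minor", "2020-5-8", "DONE")));
    }
}
```

In printCsvRow(), the String.join(...) call could then be swapped for something like CsvLine.of(Arrays.asList(key, type, priority, created, status)).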
    

    And to filter only OPEN records, I can do something like this:

    container.getIssues()
            .stream()
            // constant-first equals() avoids a NullPointerException if a status name is missing
            .filter(issue -> "OPEN".equals(issue.getFields().getStatus().getName()))
            .forEach(issue -> printCsvRow(issue));
    

    The following is printed:

    abc,Bug,Major,2020-5-11,OPEN
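
Since the goal in the question is a CSV file rather than console output, the same filter-then-format idea can be pointed at a file. This is only a sketch under stated assumptions: rows are represented as plain String arrays with the status in the last column, and the class name CsvFileSketch and the method linesWithStatus are hypothetical. It joins values with a bare comma, so it is only safe for values that need no quoting.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: rows are plain String arrays whose last column is the status.
class CsvFileSketch {

    // Keep only rows whose last column equals the given status, joined with commas.
    static List<String> linesWithStatus(List<String[]> rows, String status) {
        return rows.stream()
                .filter(row -> status.equals(row[row.length - 1]))
                .map(row -> String.join(",", row))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        List<String[]> rows = Arrays.asList(
                new String[] {"abc", "Bug", "Major", "2020-5-11", "OPEN"},
                new String[] {"def", "Info", "Minor", "2020-5-8", "DONE"});

        // A temporary file is used here so the demo does not overwrite anything.
        Path out = Files.createTempFile("issues", ".csv");
        Files.write(out, linesWithStatus(rows, "OPEN"), StandardCharsets.UTF_8);

        // Read the file back to show what was written.
        System.out.println(Files.readAllLines(out, StandardCharsets.UTF_8));
    }
}
```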
    

    To enable Jackson, I use Maven with the following dependency:

    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.10.3</version>
    </dependency>
    

    If you don't use Maven, note that this dependency pulls in three JARs: jackson-databind, jackson-annotations, and jackson-core.

    To create the nested Java classes I need (to mirror the structure of the JSON), I use a tool which generates them for me using your sample JSON.

    In my case, I used this tool, but there are others.

    I chose "Container" as the name of the root Java class; a source type of JSON; and selected Jackson 2.x annotations. I also requested getters and setters.

    I added the generated classes (Fields, Issue, Issuetype, Priority, Status, and Container) to my project.
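
For orientation, here is a trimmed, hand-written sketch of the shape such classes might take for the sample JSON. It is not the generator's actual output: real generated code typically also carries Jackson annotations, which are left out here so the sketch compiles with the JDK alone. The class names follow the list above; ModelSketch and its sample() method are hypothetical and only build one issue by hand to show how the getters chain.

```java
import java.util.ArrayList;
import java.util.List;

// Root object: mirrors the top-level {"issues": [...]} structure.
class Container {
    private List<Issue> issues = new ArrayList<>();
    public List<Issue> getIssues() { return issues; }
    public void setIssues(List<Issue> issues) { this.issues = issues; }
}

// One element of the "issues" array.
class Issue {
    private String key;
    private Fields fields;
    public String getKey() { return key; }
    public void setKey(String key) { this.key = key; }
    public Fields getFields() { return fields; }
    public void setFields(Fields fields) { this.fields = fields; }
}

// The nested "fields" object.
class Fields {
    private Issuetype issuetype;
    private Priority priority;
    private String created;
    private Status status;
    public Issuetype getIssuetype() { return issuetype; }
    public void setIssuetype(Issuetype issuetype) { this.issuetype = issuetype; }
    public Priority getPriority() { return priority; }
    public void setPriority(Priority priority) { this.priority = priority; }
    public String getCreated() { return created; }
    public void setCreated(String created) { this.created = created; }
    public Status getStatus() { return status; }
    public void setStatus(Status status) { this.status = status; }
}

class Issuetype {
    private String name;
    private String id;
    private Boolean subtask;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
    public Boolean getSubtask() { return subtask; }
    public void setSubtask(Boolean subtask) { this.subtask = subtask; }
}

class Priority {
    private String name;
    private String id;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getId() { return id; }
    public void setId(String id) { this.id = id; }
}

class Status {
    private String name;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

// Demo only: builds the first sample issue by hand instead of parsing JSON.
class ModelSketch {
    static Container sample() {
        Issuetype type = new Issuetype();
        type.setName("Bug");
        Priority priority = new Priority();
        priority.setName("Major");
        Status status = new Status();
        status.setName("OPEN");
        Fields fields = new Fields();
        fields.setIssuetype(type);
        fields.setPriority(priority);
        fields.setCreated("2020-5-11");
        fields.setStatus(status);
        Issue issue = new Issue();
        issue.setKey("abc");
        issue.setFields(fields);
        Container container = new Container();
        container.getIssues().add(issue);
        return container;
    }

    public static void main(String[] args) {
        Fields f = sample().getIssues().get(0).getFields();
        System.out.println(f.getIssuetype().getName() + "/" + f.getStatus().getName());
    }
}
```

In a real project each class would usually be public and live in its own file, which is how the generator tool delivers them.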

    WARNING: The completeness of these Java classes is only as good as the sample JSON. But you can, of course, enhance these classes to more accurately reflect the actual JSON you need to handle.

    The Jackson ObjectMapper takes care of loading the JSON into the class structure.

    I chose to use Jackson instead of JsonPath, simply because of familiarity. JsonPath appears to have very similar object mapping capabilities - but I have never used those features of JsonPath.

    Final note: You can use XPath-style predicates in JsonPath to access individual data items and groups of items, as you describe in your question. But (in my experience) it is almost always worth the extra effort to create Java classes if you want to process all your data in more flexible ways, especially if that involves transforming the JSON input into different output structures.