
Sorting by nested integer value with jq


I have a JSON object of objects:

{
  "lungCancerCellLines": {
    "componentIndex": 11,
    "active": true,
    "longName": "Lung cancer cell lines",
    "filenameKey": "lungCancerCellLines",
    "color": "#8CC63F"
  },
  "naturalKillerCells": {
    "componentIndex": 1,
    "active": true,
    "longName": "Natural killer cells",
    "filenameKey": "naturalKillerCells",
    "color": "#BB2DD4"
  },
  "respiratoryMuscular": {
    "componentIndex": 12,
    "active": true,
    "longName": "Respiratory / muscular",
    "filenameKey": "respiratoryMuscular",
    "color": "#DDE223"
  },
  "cerebellarHemisphere": {
    "componentIndex": 14,
    "active": true,
    "longName": "Cerebellar hemisphere",
    "filenameKey": "cerebellarHemisphere",
    "color": "#2E97BC"
  },
  "bCells": {
    "componentIndex": 2,
    "active": true,
    "longName": "B cells",
    "filenameKey": "bCells",
    "color": "#E6009B"
  },
  ...,
  "cd34PlusProgenitor": {
    "componentIndex": 8,
    "active": true,
    "longName": "CD34+ progenitor",
    "filenameKey": "cd34PlusProgenitor",
    "color": "#ED2024"
  }
}

I would like to write a JSON object in (ascending) order of componentIndex.

Based on previous Stack Overflow questions and answers, I have tried the following statements:

jq -s 'sort_by(.[].componentIndex)' in.json > out.v1.json

And:

jq -s 'sort_by(.[].componentIndex|tonumber)' in.json > out.v2.json

The files out.v1.json and out.v2.json are identical to in.json in presented order.

What is the correct jq statement to provide the nested objects in the desired sort order?

The real question I am asking is how to extract the value of color keys in the order provided by the integers in componentIndex. But as a first step, it looks like I need to solve how to sort on integer values with jq. Thanks for your advice.


Solution

  • I would like to write a JSON object in (ascending) order of componentIndex.

    What is the correct jq statement to provide the nested objects in the desired sort order?

    You can't reliably sort an object's fields while keeping it an object, because the order of fields is not part of the information a JSON document conveys (RFC 8259 describes an object as an unordered collection of name/value pairs). The character stream necessarily lists the fields in some order, but that is just the order you happen to see. Any other ordering of the same fields is still considered the same document, and any JSON processor may disregard or even rearrange that order without notice.
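
    To make this concrete, here is a minimal sketch, assuming jq is installed. The two inline objects are made-up sample data, not taken from the question. It shows that jq treats objects that differ only in field order as equal, and that its --sort-keys (-S) option rearranges fields on output.

    ```shell
    # Two objects with the same fields in a different order compare as equal.
    same=$(jq -n '{"a": 1, "b": 2} == {"b": 2, "a": 1}')
    echo "$same"     # prints true

    # -S (--sort-keys) writes fields in sorted key order, discarding the input order.
    sorted=$(echo '{"b": 2, "a": 1}' | jq -c -S .)
    echo "$sorted"   # prints {"a":1,"b":2}
    ```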

    That said, you can trick jq into generating exactly that ordering. First turn the object into an array representation using to_entries (which preserves the keys), then sort that array (using .value to access each entry's value), and finally re-assemble the object using the inverse filter from_entries (which, as of the current version of jq at least, happens to reflect the input array's order). Keep in mind that this ordering is not guaranteed to be preserved, or even interpreted as meaningful, by the next processor (including future versions of jq).

    jq 'to_entries | sort_by(.value.componentIndex) | from_entries' in.json
    
    {
      "naturalKillerCells": {
        "componentIndex": 1,
        "active": true,
        "longName": "Natural killer cells",
        "filenameKey": "naturalKillerCells",
        "color": "#BB2DD4"
      },
      "bCells": {
        "componentIndex": 2,
        "active": true,
        "longName": "B cells",
        "filenameKey": "bCells",
        "color": "#E6009B"
      },
      "cd34PlusProgenitor": {
        "componentIndex": 8,
        "active": true,
        "longName": "CD34+ progenitor",
        "filenameKey": "cd34PlusProgenitor",
        "color": "#ED2024"
      },
      "lungCancerCellLines": {
        "componentIndex": 11,
        "active": true,
        "longName": "Lung cancer cell lines",
        "filenameKey": "lungCancerCellLines",
        "color": "#8CC63F"
      },
      "respiratoryMuscular": {
        "componentIndex": 12,
        "active": true,
        "longName": "Respiratory / muscular",
        "filenameKey": "respiratoryMuscular",
        "color": "#DDE223"
      },
      "cerebellarHemisphere": {
        "componentIndex": 14,
        "active": true,
        "longName": "Cerebellar hemisphere",
        "filenameKey": "cerebellarHemisphere",
        "color": "#2E97BC"
      }
    }
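
    If you want to check the resulting field order without reading the whole output, the following sketch may help. It assumes jq 1.5 or later (for keys_unsorted) and uses a trimmed-down sample of three entries from the question, reduced to componentIndex only. The variable names are just for illustration.

    ```shell
    # Trimmed-down sample: three entries from the question, componentIndex only.
    sample='{"lungCancerCellLines":{"componentIndex":11},"naturalKillerCells":{"componentIndex":1},"bCells":{"componentIndex":2}}'

    # to_entries turns the object into an array of {"key": ..., "value": ...} pairs;
    # print the first pair to see the shape that sort_by(.value.componentIndex) works on.
    printf '%s' "$sample" | jq -c 'to_entries[0]'

    # keys_unsorted lists the field names in the order the object currently holds them
    # (plain `keys` would sort them alphabetically and hide the effect).
    order=$(printf '%s' "$sample" | jq -c 'to_entries | sort_by(.value.componentIndex) | from_entries | keys_unsorted')
    echo "$order"   # prints ["naturalKillerCells","bCells","lungCancerCellLines"]
    ```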
    



    The real question I am asking is how to extract the value of color keys in the order provided by the integers in componentIndex.

    For that, you don't need the keys at all: convert the object into a plain array of its values using map(.) or [.[]], sort that array using sort_by(.componentIndex) (.componentIndex is already a number, so .componentIndex|tonumber makes no difference here), and then extract the fields you're interested in from its items:

    jq -r 'map(.) | sort_by(.componentIndex)[].color' in.json
    
    #BB2DD4
    #E6009B
    #ED2024
    #8CC63F
    #DDE223
    #2E97BC
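
    As a variation, here is a small sketch, assuming jq is installed and using a trimmed-down sample of three entries from the question. It uses the [.[]] form and collects the colors into a JSON array instead of raw lines. It also hints at why the -s attempts in the question left the order unchanged: slurping a single document yields a one-element array, so sort_by has nothing to reorder.

    ```shell
    # Trimmed-down sample: three entries from the question, two fields each.
    sample='{"lungCancerCellLines":{"componentIndex":11,"color":"#8CC63F"},"naturalKillerCells":{"componentIndex":1,"color":"#BB2DD4"},"bCells":{"componentIndex":2,"color":"#E6009B"}}'

    # [.[]] collects the object's values into an array, just like map(.).
    colors=$(printf '%s' "$sample" | jq -c '[.[]] | sort_by(.componentIndex) | map(.color)')
    echo "$colors"        # prints ["#BB2DD4","#E6009B","#8CC63F"]

    # With -s the whole document becomes the single element of an outer array.
    slurped_len=$(printf '%s' "$sample" | jq -s 'length')
    echo "$slurped_len"   # prints 1
    ```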
    
