How to update a JSON file in a shell script without using jq?


I am using a shell script to update JSON. Previously I used the jq command to read and write a JSON object. However, it seems that not every bash environment supports the jq command. I tried to resolve this issue by manipulating the string using awk, sed and gsup.

However, the resulting code is quite complicated, as shown below:

update_json_key() {
    local key="$1"
    local value="$2"
    local filename="$3"

    if [[ ! -f "$filename" ]]; then
      echo "{}" > "$filename"
    fi

    SED_INPLACE=""

    if [[ "$OSTYPE" == "darwin"* ]]; then
        SED_INPLACE="-i ''"  # macOS requires an empty extension for in-place editing
    else
        SED_INPLACE="-i"     # Linux
    fi

    # Ensure the file exists and is a valid JSON object
    if [ ! -s "$filename" ] || ! grep -q "{" "$filename"; then
        echo "{}" > "$filename"
    fi

    # Check if the key exists
    if grep -q "\"$key\": " "$filename"; then
        # Key exists, update it
        # This regex ensures we don't capture more than we need by being specific with our patterns
        sed $SED_INPLACE "s/\"$key\": \"[^\"]*\"/\"$key\": \"$value\"/" "$filename"
    else
        # Key does not exist, add it
        if grep -q "^{}$" "$filename"; then
            # File has only empty JSON, directly add key without comma
            sed $SED_INPLACE "s/{}/{\"$key\": \"$value\"}/" "$filename"
        else
            # File is non-empty, append key before the last closing brace
            # This ensures that we correctly find the last closing brace even when it's not at the very end
            sed $SED_INPLACE "s/\(.*\)}$/\1, \"$key\": \"$value\"}/" "$filename"
        fi
    fi

    if [ -e "$filename''" ]; then
        rm -f "$filename''"
    fi

    if [[ -e "$filename" ]] && [[ $(wc -l < "$filename") -gt 1 ]]; then
        raw_json=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\n//g' -e 's/": \?"/": "/g' -e 's/, \?"/,"/g' "$filename")
        rm -f "$filename"
        echo "$raw_json" > "$filename"
    fi

    formatted_json=$(awk 'BEGIN {
        FS=",";
        print "{"
    }
    {
        gsub(/[{}]/, "");
        n = split($0, a, ",");
        for (i = 1; i <= n; i++) {
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", a[i]);
            print "  " a[i] (i < n ? "," : "");
        }
    }
    END {
        print "}"
    }' "$filename")

    rm -f "$filename"
    echo "$formatted_json" > "$filename"
}

And this code does not yet handle nested JSON objects. I am wondering whether there is an easier way to manipulate a JSON object from a shell script without using jq.


Solution

  • seems not every bash environment support jq command

    First determine how common this situation is, then how many users can install jq and how many users MUST NOT install jq. If the latter case is rare, consider using jq AND stating clearly that your script requires jq in order to work.

  • tried to resolve this issue by manipulating the string using awk, sed and gsup.

    I do not know the last tool named. awk and sed have no built-in support for working with JSON. There is gawkextlib, a set of extensions for GNU AWK that does support working with JSON and XML, but you need to collect its dependencies and build it, and requiring that would put a burden on the users of your script.

  • easier way to manipulate JSON object with shell script without using jq

    You might consider using the json module from Python's standard library, as Python (and its standard library) is often installed on Linux machines. Caveat: you should first poll a representative sample of your users to determine whether they have jq, Python, or both available.

    Here is a simple example, assuming the top-level structure in your JSON file is always an object. Create upsert.py with the following content

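    import argparse
    import json
    
    if __name__ == '__main__':
        # Command-line interface: upsert.py KEY VALUE FILENAME
        parser = argparse.ArgumentParser(description='UPDATE or INSERT key-value into JSON document')
        parser.add_argument('key')
        parser.add_argument('value')
        parser.add_argument('filename')
        args = parser.parse_args()
        # Load the existing document (the top-level value must be an object)
        with open(args.filename, 'r') as f:
            data = json.load(f)
        # Insert the key, or overwrite its value if it already exists
        data[args.key] = args.value
        # Write the updated document back to the same file
        with open(args.filename, 'w') as f:
            json.dump(data, f)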
    

    and let file.json content be

    {"A":"Able","B":"Baker","C":"Charlie"}
    

    then

    python upsert.py 'D' 'Dog' file.json
    

    will alter file.json content to

    {"A": "Able", "B": "Baker", "C": "Charlie", "D": "Dog"}
    

    Observe that, besides the new entry for D, spaces were added after : and , which is the default behavior. If you want different formatting, consult the json.dumps docs (in particular the indent and separators parameters).
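
    The formatting difference can be reproduced with json.dumps directly. The following is a minimal sketch rather than part of upsert.py: the sample data mirrors file.json after the upsert, and the variable names are illustrative. It contrasts the default separators with a compact form and a pretty-printed form.

```python
import json
from collections import OrderedDict

# Sample data mirroring file.json after the upsert (illustrative only).
# OrderedDict keeps the key order stable on older Python versions too.
data = OrderedDict([("A", "Able"), ("B", "Baker"), ("C", "Charlie"), ("D", "Dog")])

# Default separators: a space after ':' and after ','.
default_text = json.dumps(data)

# Compact: no optional whitespace, matching the style of the original file.json.
compact_text = json.dumps(data, separators=(",", ":"))

# Pretty-printed: one key per line, indented by two spaces.
# Passing separators explicitly avoids trailing spaces on Python 2.
pretty_text = json.dumps(data, indent=2, separators=(",", ": "))

print(default_text)
print(compact_text)
print(pretty_text)
```

    The same indent and separators arguments are accepted by json.dump, so they could be passed in the last line of upsert.py if a particular layout is wanted.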

    (tested in Python 2.7.18 and Python 3.10.12)
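
    The question also mentions nested objects, which upsert.py does not cover. One possible extension, sketched here under assumptions that are not part of the original answer, is to treat the key as a dot-separated path such as server.port. The helper name set_nested and the dotted-path convention are hypothetical; keys that themselves contain dots would need a different separator.

```python
import json


def set_nested(data, dotted_key, value):
    """Set data[k1][k2]...[kn] = value, creating intermediate objects as needed.

    Hypothetical helper: the dot-separated path convention is an assumption.
    """
    keys = dotted_key.split(".")
    node = data
    for key in keys[:-1]:
        # Create the intermediate object, replacing any non-object value found there.
        if not isinstance(node.get(key), dict):
            node[key] = {}
        node = node[key]
    # The last path component is the key that is inserted or updated.
    node[keys[-1]] = value
    return data


if __name__ == "__main__":
    doc = json.loads('{"A": "Able", "server": {"host": "localhost"}}')
    set_nested(doc, "server.port", "8080")    # updates an existing nested object
    set_nested(doc, "logging.level", "debug")  # creates the missing parent object
    print(json.dumps(doc, sort_keys=True))
```

    In upsert.py this would replace the plain data[args.key] = args.value assignment. Note that an intermediate value that is not an object is silently overwritten in this sketch; raising an error instead may be the safer choice.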