Search code examples
bashcurlmapr

using curl to call data, and grep to scrub output


I am attempting to call an API for a series of ID's, and then leverage those ID's in a bash script using curl, to query a machine for some information, and then scrub the data for only a select few things before it outputs this.

#!/bin/bash
url="http://<myserver:myport>/ws/v1/history/mapreduce/jobs"
for a in $(cat jobs.txt); do
    content="$(curl "$url/$a/counters" "| grep -oP '(FILE_BYTES_READ[^:]+:\d+)|FILE_BYTES_WRITTEN[^:]+:\d+|GC_TIME_MILLIS[^:]+:\d+|CPU_MILLISECONDS[^:]+:\d+|PHYSICAL_MEMORY_BYTES[^:]+:\d+|COMMITTED_HEAP_BYTES[^:]+:\d+'" )"
    echo "$content" >> output.txt
done

This is for a MapR project I am currently working on to peel some fields out of the API.

In the example above, I only care about 6 fields, though the output that comes from the curl command gives me about 30 fields and their values, many of which are irrelevant.

If I use the curl command in a standard prompt, I get the fields I am looking for, but when I add it to the script I get nothing.


Solution

  • Please remove quotes after $url/$a/counters" ". Like following:

    content="$(curl "$url/$a/counters | grep -oP '(FILE_BYTES_READ[^:]+:\d+)|FILE_BYTES_WRITTEN[^:]+:\d+|GC_TIME_MILLIS[^:]+:\d+|CPU_MILLISECONDS[^:]+:\d+|PHYSICAL_MEMORY_BYTES[^:]+:\d+|COMMITTED_HEAP_BYTES[^:]+:\d+'" )"